It's not so simple, especially since there are many ways to go about it, depending on the requirements of the game you are generating the map for and your own preferences for the end result.
But if I had to boil it all down, first I would use Perlin noise to generate a topographical map, then modify (multiply) the resulting values by a modifier determined by the Voronoi value at each position.
Understand that all (2D) noise does is this: for any given input (an x/y coordinate pair), it returns a value based on the noise algorithm used. So if you apply Perlin noise to an area, every coordinate will have a value that corresponds to its elevation, or whatever else you desire. Those same coordinates will also have a different value according to Voronoi noise.
So let's say you decide that all Voronoi cells with a value of less than .5 are ocean. You would first get the elevation at x/y from the Perlin noise, then check whether that x/y has a value of less than .5 in the Voronoi noise. If it does, multiply the elevation by .1, so the final value will end up "lower" than in the cells that are not oceans.
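To make the ocean example concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: `smooth_noise` is a simple value-noise stand-in rather than real Perlin noise, `voronoi_value` gives one random value per cell of a jittered grid, and every name, hash constant, and the `* 4` frequency is made up for the example. In a real project you would most likely call your noise library instead.

```python
import math

OCEAN_THRESHOLD = 0.5  # Voronoi cells below this value are treated as ocean
OCEAN_FACTOR = 0.1     # ocean elevations are multiplied by this


def hash01(ix, iy, seed=0):
    """Deterministic pseudo-random float in [0, 1) for an integer grid point."""
    n = (ix * 374761393 + iy * 668265263 + seed * 1442695041) & 0xFFFFFFFF
    n = ((n ^ (n >> 13)) * 1274126177) & 0xFFFFFFFF
    n ^= n >> 16
    return n / 2**32


def smooth_noise(x, y, seed=0):
    """Value noise: random grid values, smoothly interpolated (Perlin stand-in)."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    sx, sy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)  # smoothstep weights
    top = hash01(x0, y0, seed) * (1 - sx) + hash01(x0 + 1, y0, seed) * sx
    bottom = hash01(x0, y0 + 1, seed) * (1 - sx) + hash01(x0 + 1, y0 + 1, seed) * sx
    return top * (1 - sy) + bottom * sy


def voronoi_value(x, y, seed=0):
    """Random value of the Voronoi cell containing (x, y).

    Each grid square holds one jittered feature point; the cell a position
    belongs to is the one whose feature point is nearest.
    """
    cx, cy = math.floor(x), math.floor(y)
    best_d, best_val = math.inf, 0.0
    # A 5x5 block of grid squares always contains the true nearest point.
    for iy in range(cy - 2, cy + 3):
        for ix in range(cx - 2, cx + 3):
            px = ix + hash01(ix, iy, seed + 1)
            py = iy + hash01(ix, iy, seed + 2)
            d = (px - x) ** 2 + (py - y) ** 2
            if d < best_d:
                best_d, best_val = d, hash01(ix, iy, seed + 3)
    return best_val


def elevation(x, y, seed=0):
    """Noise elevation, pushed down wherever the Voronoi cell is ocean."""
    height = smooth_noise(x * 4, y * 4, seed)  # finer detail than the cells
    if voronoi_value(x, y, seed) < OCEAN_THRESHOLD:
        height *= OCEAN_FACTOR
    return height


if __name__ == "__main__":
    for row in range(4):
        print(" ".join(f"{elevation(col * 0.5, row * 0.5):.2f}" for col in range(8)))
```

The elevation noise is sampled at a higher frequency than the cells so that each ocean or land cell contains some terrain variation; that ratio is a matter of taste.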
Layering additional noise is similar. Let's say you don't want such a sharp cutoff between land and ocean. Notice that cellular (Worley) noise has the same borders and cell shapes as the Voronoi noise, but with lower values towards the center of each cell and higher values at the edges. You could invert this, and use it to scale how strongly the biome modifier from the Voronoi noise is applied. The result would be that there is less of an effect near the edges of the cells, and a greater effect towards the center.
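Here is one way that softening could look in Python. Treat it as a sketch of one reading of the idea, not the only one: the inverted cellular value is used as a blend weight between "no change" (1.0) and the full ocean factor, the weight is simply `1 - distance` clamped at zero, and all names (`feature_point`, `cell_value`, `nearest_cell`, `biome_modifier`) are hypothetical.

```python
import math


def hash01(ix, iy, seed=0):
    """Deterministic pseudo-random float in [0, 1) for an integer grid point."""
    n = (ix * 374761393 + iy * 668265263 + seed * 1442695041) & 0xFFFFFFFF
    n = ((n ^ (n >> 13)) * 1274126177) & 0xFFFFFFFF
    n ^= n >> 16
    return n / 2**32


def feature_point(ix, iy, seed=0):
    """Jittered feature point that lives inside grid square (ix, iy)."""
    return ix + hash01(ix, iy, seed + 1), iy + hash01(ix, iy, seed + 2)


def cell_value(ix, iy, seed=0):
    """Random value shared by every position inside that Voronoi cell."""
    return hash01(ix, iy, seed + 3)


def nearest_cell(x, y, seed=0):
    """Return (distance to the nearest feature point, that cell's value).

    The distance is the cellular (Worley) value; the cell value is the
    Voronoi value. Both come from the same nearest-point search.
    """
    cx, cy = math.floor(x), math.floor(y)
    best_d, best_val = math.inf, 0.0
    for iy in range(cy - 2, cy + 3):
        for ix in range(cx - 2, cx + 3):
            px, py = feature_point(ix, iy, seed)
            d = (px - x) ** 2 + (py - y) ** 2
            if d < best_d:
                best_d, best_val = d, cell_value(ix, iy, seed)
    return math.sqrt(best_d), best_val


def biome_modifier(x, y, seed=0, threshold=0.5, ocean_factor=0.1):
    """Elevation multiplier: 1.0 on land, fading towards ocean_factor in oceans."""
    dist, value = nearest_cell(x, y, seed)
    if value >= threshold:
        return 1.0  # land cell: leave the elevation unchanged
    # Inverted cellular noise: 1 at the feature point, smaller towards the edges.
    weight = max(0.0, 1.0 - dist)
    # Blend from "no change" (1.0) to the full ocean factor by that weight.
    return 1.0 + weight * (ocean_factor - 1.0)
```

You would multiply the Perlin elevation by `biome_modifier(x, y)` instead of a flat .1. Keep in mind that the plain nearest-point distance only softens the step at a border. The difference between the distances to the two nearest feature points is exactly zero on a border, so it may be a better weight if you need a seamless transition.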
The returned values can be anything you want, not just elevation. For example, you could use Voronoi noise to determine "Are there trees at this coordinate or not?", or you could use classic (Perlin) noise to determine "How dense are the trees at any given coordinate?"
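The tree example might look something like this in Python. The function takes the two noise sources as arguments, so it works with whatever library you use; `checker` and `ripple` below are toy stand-ins invented for this sketch (they are not real Voronoi or Perlin noise), and the .5 threshold is an arbitrary choice.

```python
import math


def tree_density(x, y, forest_noise, density_noise, threshold=0.5):
    """Return 0.0 where there are no trees, otherwise the local density.

    forest_noise answers "are there trees here at all?" (per-cell value),
    density_noise answers "how dense are they?" (smooth value).
    """
    if forest_noise(x, y) < threshold:
        return 0.0
    return density_noise(x, y)


def checker(x, y):
    """Toy per-cell value: alternates 0.0 and 1.0 between grid squares."""
    return float((math.floor(x) + math.floor(y)) % 2)


def ripple(x, y):
    """Toy smooth value in [0, 1]."""
    return 0.5 + 0.5 * math.sin(x) * math.cos(y)


if __name__ == "__main__":
    for px, py in [(0.5, 0.5), (1.5, 0.5), (2.5, 1.5)]:
        print((px, py), tree_density(px, py, checker, ripple))
```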