If you are familiar with DF terms, I am a dabbling Programmer.
There's a link to the video on the first page. I have a colossal 30-second clip that I'll try to downconvert to something portable, but take a look at the earlier clip to see some of the stuff you can do with a few parameters.
I'd be glad to share what I've got.
The key thing is the noise generator: a random heightmap or fractal that (and this is key) has consistent slopes. This is opposed to a really noisy heightmap where each pixel has a random value with no connection to adjacent ones. That is:

(Note that red is extreme highs, blue is extreme lows, and the greens are elevation lines.)
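For anyone who wants to try this outside a 3D package, here is a rough Python sketch of noise with consistent slopes. It is plain value noise (random values on a coarse grid, smoothly interpolated in between), which is an assumption on my part and not necessarily what Art of Illusion does internally. The names `smooth_noise`, `cell` and `seed` are made up for the example.

```python
import random

def smooth_noise(width, height, cell, seed=0):
    """Value noise: random values on a coarse lattice, smoothly interpolated.

    `cell` is the lattice spacing in pixels; larger cells give broader hills.
    Returns a list of rows with values in [0, 1).
    """
    rng = random.Random(seed)
    cols = width // cell + 2   # one extra column/row so gx + 1 is always valid
    rows = height // cell + 2
    lattice = [[rng.random() for _ in range(cols)] for _ in range(rows)]

    def smoothstep(t):
        # Eases the interpolation so slopes flatten out at lattice points.
        return t * t * (3 - 2 * t)

    grid = []
    for y in range(height):
        gy, ry = divmod(y, cell)
        ty = smoothstep(ry / cell)
        row = []
        for x in range(width):
            gx, rx = divmod(x, cell)
            tx = smoothstep(rx / cell)
            top = lattice[gy][gx] * (1 - tx) + lattice[gy][gx + 1] * tx
            bottom = lattice[gy + 1][gx] * (1 - tx) + lattice[gy + 1][gx + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        grid.append(row)
    return grid
```

Compare that with `[[random.random() for _ in range(w)] for _ in range(h)]`, which is the "really noisy" case: neighbouring pixels there have nothing to do with each other.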
Then you scale it down, shift it, and add it to the original pattern so that the larger noise picks up finer details from the smaller noise. This process can be repeated several times to get many levels of detail. In my map I have two levels for most terrain features.
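The scale, shift, and add step could look something like this in Python. Treat it as a sketch under assumptions: halving the strength and doubling the frequency per level are common choices and not numbers from my setup, and `value_noise`, `lattice_value`, `layered_noise` and the `shift` value are hypothetical names for illustration.

```python
import math
import random

def lattice_value(ix, iy, seed):
    # Deterministic pseudo-random value in [0, 1) for an integer lattice point.
    return random.Random((ix * 73856093) ^ (iy * 19349663) ^ seed).random()

def value_noise(x, y, seed=0):
    # Smoothly interpolated noise at a continuous (x, y) position.
    ix, iy = math.floor(x), math.floor(y)
    tx, ty = x - ix, y - iy
    tx = tx * tx * (3 - 2 * tx)
    ty = ty * ty * (3 - 2 * ty)
    top = lattice_value(ix, iy, seed) * (1 - tx) + lattice_value(ix + 1, iy, seed) * tx
    bottom = lattice_value(ix, iy + 1, seed) * (1 - tx) + lattice_value(ix + 1, iy + 1, seed) * tx
    return top * (1 - ty) + bottom * ty

def layered_noise(x, y, levels=2, shift=17.3, seed=0):
    """Sum several copies of the noise, each finer and weaker than the last."""
    total = 0.0
    norm = 0.0
    amplitude = 1.0   # assumed: each level counts half as much as the previous
    frequency = 1.0   # assumed: each level is twice as fine as the previous
    for level in range(levels):
        offset = level * shift  # shift so the levels don't line up with each other
        total += amplitude * value_noise(x * frequency + offset,
                                         y * frequency + offset,
                                         seed + level)
        norm += amplitude
        amplitude *= 0.5
        frequency *= 2.0
    return total / norm  # keep the result in [0, 1)
```

Sampling it as `layered_noise(px / 32, py / 32)` for each pixel gives a heightmap with broad shapes from the first level and smaller bumps from the second.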
Then we can simply create a slightly different grayscale map for each terrain type and make the terrains interact (for instance, add 0.2 to swamp where elevation is 0 and subtract 0.2 from it where elevation is 1). Once you have the final grayscale map, set every point that is higher than X to 1 and everything else to 0. The effect is that you get these nice, organic-looking terrain borders like I have above. Some terrain might need more detail (for instance, the high-contrast change from water to land means that there should be one or two extra layers for it on the grayscale). X can be selectable, and small changes to it cause more or less of the relevant terrain to be present. Finally, layer the terrain patterns on top of each other and there's your map.
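Here is the swamp example as a small Python sketch. The +0.2 at elevation 0 and -0.2 at elevation 1 come from the description above; blending linearly in between is my assumption, and `swamp_mask`, `bias` and `threshold` (the "X") are names invented for the example.

```python
def swamp_mask(swamp, elevation, threshold=0.5, bias=0.2):
    """Threshold a swamp grayscale map after nudging it by elevation.

    `swamp` and `elevation` are same-sized lists of rows with values in [0, 1].
    Returns a map of 1 (swamp) and 0 (no swamp).
    """
    mask = []
    for swamp_row, elev_row in zip(swamp, elevation):
        row = []
        for s, e in zip(swamp_row, elev_row):
            # +bias at elevation 0, -bias at elevation 1, linear in between (assumed).
            adjusted = s + bias * (1 - 2 * e)
            row.append(1 if adjusted > threshold else 0)
        mask.append(row)
    return mask

swamp = [[0.4, 0.4], [0.6, 0.6]]
elevation = [[0.0, 1.0], [0.0, 1.0]]
print(swamp_mask(swamp, elevation))  # [[1, 0], [1, 0]]
```

Raising `threshold` a little shrinks the swamp, lowering it grows the swamp, which is the "small changes to X" effect.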
Now, for a better system, you could compare all maps at each point and select what is most intense at any point rather than just culling, but I don't know how that would look for you.
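I haven't built that version, so take this as an untested idea in Python form: give it one grayscale map per terrain and it picks whichever is strongest at each point. `dominant_terrain` and the terrain names are purely illustrative.

```python
def dominant_terrain(layers):
    """Pick the terrain whose grayscale value is highest at each point.

    `layers` maps a terrain name to a list of rows of values; all maps must
    have the same size. Ties go to the terrain listed first.
    """
    names = list(layers)
    first = layers[names[0]]
    height, width = len(first), len(first[0])
    return [
        [max(names, key=lambda name: layers[name][y][x]) for x in range(width)]
        for y in range(height)
    ]

layers = {
    "forest": [[0.7, 0.2], [0.4, 0.1]],
    "swamp":  [[0.3, 0.6], [0.5, 0.2]],
    "desert": [[0.1, 0.1], [0.2, 0.9]],
}
print(dominant_terrain(layers))
# [['forest', 'swamp'], ['swamp', 'desert']]
```

Unlike the cutoff approach, every point gets exactly one terrain, so there are no gaps or overlaps to sort out when layering.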
Here's the "source code". It's built in Art of Illusion, so it's not getting out of there.

Trust me, that's actually pretty good for what we're working with.
Oh, there's one more thing that I've discovered: real terrain has a non-linear, non-sine distribution. That is, most terrain is slightly above or below water, with exponentially less terrain at higher elevations. I figured out that if you're starting with a noise pattern that's linear, like the one I have at the top of this post, you just need to decrease the prevalence of extreme values and increase common ones, something like an inverted sine curve. for the actual data (
which isn't all that useful, but interesting nonetheless).
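One way to read "inverted sine curve" is an arcsine remap, so here is a sketch of that in Python. This is my interpretation and not a fitted model of real terrain; `reshape_elevation` is a made-up name, and it assumes the input values are spread roughly evenly over 0 to 1 with sea level around 0.5.

```python
import math

def reshape_elevation(v):
    """Remap v in [0, 1] so mid values become more common and extremes rarer.

    Uses an arcsine curve: steep near 0 and 1 (extremes get spread thin),
    flatter near 0.5 (values bunch up around the middle).
    """
    return 0.5 + math.asin(2 * v - 1) / math.pi

# Evenly spread inputs end up concentrated around the middle.
inputs = [i / 1000 for i in range(1001)]
outputs = [reshape_elevation(v) for v in inputs]
near_middle_before = sum(1 for v in inputs if 0.4 <= v <= 0.6)
near_middle_after = sum(1 for v in outputs if 0.4 <= v <= 0.6)
print(near_middle_before, near_middle_after)
```

The endpoints and the midpoint stay where they are (0, 0.5 and 1 map to themselves); only how crowded each elevation band is changes.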
Heh, I didn't realize it, but I invented procedural Brownian motion...