First, there are 3 video codecs involved here.
H.323 _requires_ H.261 if your endpoint supports video at all. Unfortunately, H.261 is so old and so low-resolution that very few H.323-capable endpoints today actually support it as they should. Of the three codecs, H.261 requires the most bandwidth but the least CPU to encode/decode.
There is no H.262. (Well, there is, it's the MPEG-2 video standard, but it simply isn't used by telepresence endpoints.) I think you may have misunderstood your Sorenson agent, or your Sorenson agent was misinformed.
H.263 is the video codec that has been supported by all deaf video phones since NetMeeting in the early days of VRS. It has been the de facto video codec for deaf video phones because everyone supports it. It requires slightly less bandwidth than H.261 for the same resolution/framerate, but it also requires a bit more CPU to encode/decode. There is no requirement for any H.323 endpoint to use H.263; it is entirely optional. On the other hand, it has been available in every VRS videophone up to the Sorenson nTouch endpoints. More on this below.
H.263 in its RFC2190 packetization (static RTP payload type 34) is the most compatible video codec to negotiate between endpoints, simply because most endpoints support it.
There are also H.263+ (aka H.263 1998 or H.263v2) and H.263++ (aka H.263 2000 or H.263v3), which use a dynamic payload type and have aspects of H.264 integrated into them. Many endpoints support these as well, but not all do.
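To make the static versus dynamic distinction concrete, here is a minimal Python sketch that pulls the payload type out of an RTP header and classifies it. The function names and the example packet are mine, purely for illustration. The numbers come from the RTP specs: RFC 3551 statically assigns payload type 34 to H.263, while 96 through 127 are dynamic and mean nothing until the two endpoints have signalled what they map to.

```python
import struct

H263_STATIC_PT = 34                 # static assignment for H.263 (RFC 3551)
DYNAMIC_PT_RANGE = range(96, 128)   # dynamic types, meaning set by signalling


def rtp_payload_type(packet: bytes) -> int:
    """Return the 7-bit payload type from a raw RTP packet."""
    if len(packet) < 12:
        raise ValueError("an RTP header is at least 12 bytes")
    first, second = struct.unpack_from("!BB", packet, 0)
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    return second & 0x7F            # the top bit of this byte is the marker bit


def describe_payload_type(pt: int) -> str:
    """Classify a payload type as static H.263, dynamic, or something else."""
    if pt == H263_STATIC_PT:
        return "H.263 (RFC 2190, static)"
    if pt in DYNAMIC_PT_RANGE:
        return "dynamic (codec depends on what was negotiated)"
    return "other static or unassigned"


# Made-up 12-byte header: version 2, marker bit set, payload type 34.
example = bytes([0x80, 0x80 | 34]) + bytes(10)
print(describe_payload_type(rtp_payload_type(example)))
```

This is why plain H.263 is so easy to interoperate with: a receiver can recognize payload type 34 without any extra mapping, whereas H.263+ and H.264 streams are only identifiable once the dynamic number has been agreed on during call setup.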
H.264 is the "new" codec supported by newer endpoints. It uses the least bandwidth for the same resolution/framerate, but it also requires an order of magnitude more CPU power to encode/decode. There is no requirement for any H.323 endpoint to use H.264; it is entirely optional.
There are many H.264 "profiles". These include baseline and main, plus RCDO (reduced complexity decode option) and a mess of "extensions". There is also a difference between H.264 AVC (Advanced Video Coding) and H.264 SVC (Scalable Video Coding), the latter of which is used in the newest endpoints from Vidyo and Google Talk.
The key thing to realize with H.264 is that there are many different profiles, and even more proprietary extensions. Simply put, getting two endpoints to negotiate a flavor of H.264 that both of them accept is not trivial, and is fraught with interoperability problems and bugs.
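To give a feel for what "profile" means on the wire, here is a small Python sketch that decodes the profile-level-id parameter that SDP uses to describe an H.264 stream (RFC 6184). That is the SIP-side encoding; H.323 endpoints carry equivalent information differently, so treat this as an illustration, not as what any particular videophone does. The function name is hypothetical, and the sketch ignores the level 1b special case.

```python
# Common profile_idc values from the H.264 specification.
PROFILE_NAMES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}


def parse_profile_level_id(hex_str: str) -> dict:
    """Split a 6-hex-digit profile-level-id into its three bytes."""
    if len(hex_str) != 6:
        raise ValueError("profile-level-id must be exactly 6 hex digits")
    profile_idc, constraints, level_idc = bytes.fromhex(hex_str)
    return {
        "profile": PROFILE_NAMES.get(profile_idc, f"unknown ({profile_idc})"),
        "constraint_flags": f"{constraints:08b}",  # constraint_set bits, MSB first
        "level": level_idc / 10,                   # level_idc 31 means level 3.1
    }


# Two values that are often seen in SDP offers.
for value in ("42e01f", "4d001f"):
    print(value, parse_profile_level_id(value))
```

Two endpoints that both "support H.264" can still advertise different profiles, constraint flags, and levels here, and that is before any proprietary extensions enter the picture.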
Presently, the Mirial mobile device software videophone used by most of the VRS providers only supports H.263.
But, again, this should be OK. All VRS endpoints up to this point have supported H.263.
Sorenson has effectively broken H.263 in their nTouch PC version. The RFC2190 RTP media generated by their nTouch PC version is playable by other standard H.263-capable endpoints, but their PC version _refuses_ to play the H.263 video sent to it. This means that the nTouch PC shows black video, but whoever they call sees the video just fine.
Sorenson has also disabled H.263 entirely in their nTouch mobile version. It _only_ does H.264. This alienates all of the H.263 capable deaf video endpoints.
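The effect of all this can be sketched as a simple capability match. The Python below is a toy model, not real H.245 or SDP negotiation, and the capability lists are my assumptions based on the behaviour described above, not on any vendor documentation. In particular, listing H.264 for the nTouch PC is a guess, and the real nTouch PC shows black video instead of failing the negotiation cleanly.

```python
from typing import Optional

# Assumed capability lists, in each endpoint's order of preference.
ENDPOINTS = {
    "older H.263 videophone": {"send": ["H.263"], "recv": ["H.263"]},
    "Mirial mobile": {"send": ["H.263"], "recv": ["H.263"]},
    "nTouch mobile": {"send": ["H.264"], "recv": ["H.264"]},
    # Sends H.263 that others can play, but will not play H.263 it receives.
    "nTouch PC": {"send": ["H.264", "H.263"], "recv": ["H.264"]},
}


def pick_codec(sender: str, receiver: str) -> Optional[str]:
    """Return the first codec the sender can send and the receiver can play."""
    for codec in ENDPOINTS[sender]["send"]:
        if codec in ENDPOINTS[receiver]["recv"]:
            return codec
    return None  # no usable video in this direction


for a, b in [("nTouch PC", "older H.263 videophone"),
             ("older H.263 videophone", "nTouch PC"),
             ("Mirial mobile", "nTouch mobile")]:
    print(f"{a} -> {b}: {pick_codec(a, b) or 'no common codec'}")
```

Video is negotiated per direction, which is why a call can end up with working video one way and nothing the other way.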
As to SIP vs H.323, don't get me started. That's about signalling and codec negotiation, and is a topic unto itself.
The FCC requires no video codec. Of any kind. If they were wise, they would require H.263. But that's not what the FCC does, or at least not yet. Someday, maybe.
I just did a quick google, and I found this webpage that shows which codecs each endpoint supports:
http://www.salyens.com/interop/index.html
Notice how most of these videophones support H.263?
Until there is some kind of government-regulated requirement, VRS providers should be trying to build endpoints that are backward compatible. It's kinda impossible to be _forward_ compatible.
For example, it's not like we have the source code to the Dlink DVC1000 (what Sorenson put out as the VP100) to add H.264 to that videophone. That's right, a Sorenson VP100 can't call an nTouch. Don't you find that rather odd?
Please ask more questions. I'm glad to answer them in as much engineering detail as you can bear.