Around IT in 256 seconds

#74: SOAP: (not really) Simple Object Access Protocol

May 16, 2022 | 3 Minute Read

SOAP, formerly known as Simple Object Access Protocol, is a messaging standard. SOAP is very broad and general. Technically, it can support request-response, as well as fire-and-forget communication. The underlying protocol is typically HTTP, but there’s nothing against using message brokers. Or even good old SMTP. You know, the one for exchanging e-mails. The communication happens through XML messages. These messages are well-defined and structured. XML schema is agreed upon before any communication.

In SOAP, everything starts with a schema. XML has a very detailed standard for defining valid schema. So each message, both request and response, is well-defined. Even possible faults are described. We know in advance which fields to expect, what is their type, etc. If you send a malformed request, it’ll get rejected straight away. Malformed response, even if it reaches your client, will get rejected as well.

This gives us a great deal of confidence. Contrast that to RESTful APIs. They are loosely defined and should be discovered, rather than described in advance. GraphQL, on the other hand, realized that schema is a good thing. A machine-readable contract is better than documentation or convention. Sadly, it makes API evolution harder.

SOAP messages are then grouped into operations. This happens in yet another XML file, called WSDL. Web Services Description Language. The last version of this standard is 15 years old. It’s either perfect or abandoned. Hmmm…?

Anyway! With WSDL you define an operation, like GetAccountBalance. The operation has input (request) and output (response). Both are XML documents. Now, here’s the bizarre part. SOAP is typically tunnelled through HTTP. However, it only uses HTTP POST requests. And a single endpoint for all operations. So even GetAccountBalance, clearly idempotent and read-only operation, uses POST.

Compare that to RESTful APIs. At its core REST embraces all HTTP verbs and URLs. So HTTP GET /balance endpoint, for example. GraphQL, on the other hand, works exactly like SOAP. Single endpoint and POST only.

One more peculiar feature of SOAP - it’s a protocol inside a protocol. What do I mean by that? Well, each SOAP request going through HTTP has its envelope and a bunch of headers. As if HTTP didn’t have them already. XML request is wrapped in an XML body that, together with an XML envelope, is wrapped in a SOAP message. Just as if XML wasn’t verbose enough. This weird specification technically allows tunnelling SOAP through e-mails or text messages.

RESTful APIs traditionally use JSON, a much more compact format. But they should support content type negotiation. GraphQL has standardized JSON as well. These days we hear a lot about binary protocols, that support schema as well as good compression.

SOAP has so much more to offer. There’s:

  • SOAP with Attachments
  • SOAP-over-UDP
  • WS-Addressing
  • XML-binary Optimized Packaging
  • WS-Security
  • WS-Policy
  • WS-SecurityPolicy
  • UDDI

Years later, SOAP almost disappeared and if you still have to use it, I feel sorry for you. However, it’s worth understanding why it became so popular a decade ago. And what REST and GraphQL could learn from it. It’s actually surprising how similar GraphQL is to SOAP. As always, history likes to repeat itself.

That’s it, thanks for listening, bye!

More materials

Tags: GraphQL, RESTful, SOAP, WSDL

Be the first to listen to new episodes!

To get exclusive content: