code-search API documentation

On this page

What the service does
Authentication
REST API
Reporting feedback
MCP (Model Context Protocol)
Error responses
Reporting issues

What the service does

The code-search service finds the right code (SNOMED CT, LOINC, etc.) for clinical text, constrained by a FHIR context parameter (e.g. StructureDefinition#element) or a ValueSet URI. It layers intelligent matching over a FHIR terminology server's ValueSet/$expand: deterministic fast-path matching for common cases, with LLM evaluation and iterative search-term expansion for harder cases.

The intended use is to take clinical text whose meaning needs to be coded — for example, from a clinical note — and produce a code that satisfies the binding required by a FHIR resource element or your application's value set.

Meaning fidelity

The service picks the code whose meaning matches the input as closely as possible without adding meaning that isn't in the input. If the input says "thyroid scan" the service will not return a code that means "iodine-123 thyroid scan" — it would be inserting a method the clinician didn't write. If no exact-meaning code exists, the service falls back to the closest broader code that captures everything the input does say, and never to a narrower one. Returning a broader code is honest under-coding; returning a narrower code is fabrication.

Where a single code can't capture the full meaning, the response includes intersection_codes — secondary codes whose meanings, combined with the primary, encode what the text actually said. These are separate clinical concepts, not alternate codings of the same concept; under FHIR all Coding entries within a single CodeableConcept must represent the same concept (different terminology, same meaning). Callers should map intersection codes to the appropriate FHIR element for each — e.g. body site to Condition.bodySite, supporting evidence to Condition.evidence, secondary findings to a separate Condition resource — or, when the bound terminology supports it (SNOMED CT in particular), use a post-coordinated expression to encode the compound meaning in a single Coding.
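As a rough illustration of the mapping described above, the sketch below places the primary match in Condition.code and a body-site intersection code in Condition.bodySite. It assumes intersection entries carry the same code/system/display fields as matches, and that the caller decides which FHIR element each intersection code belongs to; neither is confirmed on this page. The function names are hypothetical, and the output is a fragment that omits required Condition elements such as subject.

```python
from typing import Optional


def to_codeable_concept(entry: dict) -> dict:
    """Wrap one code entry (code/system/display) in a FHIR CodeableConcept."""
    coding = {"system": entry["system"], "code": entry["code"]}
    if entry.get("display"):
        coding["display"] = entry["display"]
    return {"coding": [coding]}


def build_condition(primary: dict, body_site: Optional[dict] = None) -> dict:
    """Put the primary match in Condition.code and a body-site code in bodySite."""
    condition = {"resourceType": "Condition", "code": to_codeable_concept(primary)}
    if body_site is not None:
        # bodySite repeats in FHIR, so it is a list of CodeableConcepts.
        # It is kept separate from Condition.code because it is a different concept.
        condition["bodySite"] = [to_codeable_concept(body_site)]
    return condition
```

Keeping each concept in its own element, rather than adding a second Coding to Condition.code, follows the rule above that all Coding entries in one CodeableConcept must mean the same thing.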

FHIR binding awareness

The service understands FHIR's binding strength (required, extensible, preferred, example) and additional bindings on the bound element. It respects what each strength allows:

A required binding means "use a code from this ValueSet" — the service will not return a code outside it. If no in-VS match exists, the service returns no match rather than fabricating one.
extensible / preferred bindings mean "use a code from this ValueSet if one fits; otherwise pick something appropriate from the same code system". The service falls back to a sensible broader hierarchy in that case — e.g. for a SNOMED preferred imaging-procedure binding, it'll search the SNOMED imaging-procedure hierarchy when the bound VS has nothing.
binding.additional entries (R5) are honoured by purpose: maximum caps the fallback, required means the result must intersect with that VS too, preferred/extensible are augmenting suggestions. The service merges candidates from all relevant bindings and ranks accordingly.

Concrete example. Given the AU eRequesting ServiceRequest.code element for imaging requests, which has a preferred binding to the RANZCR Radiology Referral ValueSet:

Input "Radionuclide thyroid scan" → the term isn't in the bound RANZCR VS. A naive $expand against that VS returns nothing. This service detects the preferred strength permits fallback, falls through to the SNOMED imaging hierarchy, finds 385443001 Radionuclide thyroid imaging, and returns it — without choosing the more specific 763810005 Iodine-123 radionuclide thyroid imaging, because the input never said iodine-123.
Input "Chest X-ray" on the same element → in-VS match is found directly; no fallback needed.
Same input on an element with a required binding to a small enum (e.g. Condition.clinicalStatus) → the service returns a code from the bound enum or no match at all.
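The fallback rule can be summarised from the caller's side as a small lookup. This is only a sketch of the behaviour described above, not the service's implementation; the function name is hypothetical, and the example strength returns None because this page does not spell out what happens for it.

```python
from typing import Optional


def may_search_beyond_valueset(strength: str) -> Optional[bool]:
    """True if fallback outside the bound ValueSet is allowed, False if not,
    None where the documentation above does not say."""
    if strength == "required":
        # A required binding never yields a code outside the ValueSet.
        return False
    if strength in ("extensible", "preferred"):
        # The service may widen to a broader hierarchy in the same code system.
        return True
    if strength == "example":
        return None  # behaviour not described on this page
    raise ValueError(f"unknown binding strength: {strength!r}")
```

A caller might use this to decide whether an empty result on a required binding is final, or whether a returned code may legitimately sit outside the bound ValueSet.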

Why use this instead of `$expand` directly?

A naive client can call ValueSet/$expand?filter=... and pick the first result. That works for clean inputs against well-curated ValueSets. The cases this service handles that $expand on its own does not:

Abbreviations and shorthand: T2DM, HbA1c, FBE U&E LFT, EpiPen, Ventolin 100mcg. The service recognises these and resolves to the proper coded concept.
Consumer language: "sugar diabetes" → Diabetes mellitus; "ticker trouble" → Cardiac disorder; "blood thinners" → the anticoagulant concept.
AU and UK spelling, regional terminology: haemoglobin matches Hemoglobin; "Pred" in an AU prescribing context resolves to Prednisolone (not Prednisone, the US convention).
Multi-code-system support: SNOMED CT, LOINC (RCPA SPIA pathology, AU Core diagnostic-result bindings), and small required-binding enums (status / criticality / intent) for AU Core and FHIR core profiles — all behind a single API.
Binding-aware fallback (described above): respects the FHIR binding strength to decide whether to widen the search beyond the bound ValueSet.
No overstepping: the result will not add subtype, method, laterality, or other detail the input didn't carry. Better to return a broader honest code than a narrower fabricated one.
Returns nothing rather than guessing: when the input describes a patient theory or non-clinical etiology that no code can faithfully capture (e.g. "wifi triggering seizures", "tap water making me feel unwell"), the service returns an empty result rather than coding the symptom alone and mis-representing the input.
Reasoning on every match: each returned code carries a human-readable explanation of why it was selected. Useful for audit, for clinician review, and for training data.

Authentication

All requests require a JWT bearer token issued by our authorization server.

Authorization: Bearer eyJhbGciOiJSUzI1NiIs...

Since you're reading this you already have an account and are signed in. There are four routes to making authenticated calls, depending on the use case:

Use the demo app. The interactive demo forwards the bearer token from your portal session on every request. It's the easiest way to experiment with the API against the bindings and ValueSets the portal pre-populates — no token handling on your part.
Use your portal-session token directly for ad-hoc exploration or scripts run from your own machine. The same token your browser holds after signing in to the portal works as the bearer for direct REST requests. You can read it from the page's session storage if you want to drop it into a curl one-liner. Don't paste it into Claude Desktop or another MCP client — tokens shouldn't be persisted as plain config; the next route covers that case properly.
Connect an OAuth-aware MCP client (Claude Desktop, Claude Code, Cursor, Cline) by pointing it at https://code-search.australiaeast.cloudapp.azure.com/mcp (beta URL). The client discovers the auth requirements via the Protected Resource Metadata at /.well-known/oauth-protected-resource and runs its own OAuth flow against the same authorization server you signed into. You'll get a "log in" prompt the first time; from then on the MCP client carries its own token, separate from the portal session. No copy-paste.
Request OAuth client credentials from ontoserver-support@csiro.au for system-to-system integrations. We'll issue a confidential client (client_id + client_secret) so your service can mint its own tokens via the client_credentials grant against the authorization server's token endpoint. Use this when the caller isn't a human in a browser session.
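For the system-to-system route, a client_credentials token request might be built as in the sketch below. The token endpoint URL is a placeholder, since this page does not give the real one, and sending the client credentials via HTTP Basic is an assumption about what the authorization server accepts. The function only builds the request; it does not send it.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical placeholder: the real token endpoint is not given on this page.
TOKEN_ENDPOINT = "https://auth.example.invalid/oauth2/token"


def build_token_request(client_id: str, client_secret: str,
                        token_endpoint: str = TOKEN_ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client_credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    # Assumption: the server accepts client authentication via HTTP Basic.
    # Simplified: assumes the id and secret contain no characters needing extra encoding.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the result to urllib.request.urlopen would send it; a standard OAuth 2.0 token response is JSON with an access_token field, which then goes in the Authorization: Bearer header shown earlier.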

REST API

POST /api/v1/find-code

Find the best code for a clinical-text query.

Request body:

{
  "text": "type 2 diabetes",
  "context": "http://hl7.org.au/fhir/core/StructureDefinition/au-core-condition#Condition.code",
  "max_candidates": 3,
  "effort": "balanced"
}

| Field | Type | Required | Description |
|---|---|---|---|
| `text` | string | yes | Clinical text to encode |
| `context` | string | one of context/url required | FHIR profile element with binding (e.g. `StructureDefinition…#Condition.code`) |
| `url` | string | one of context/url required | ValueSet canonical URL |
| `system` | string | no | Code system the result should be drawn from (e.g. `http://snomed.info/sct`, `http://loinc.org`). Default `http://snomed.info/sct`. |
| `system_version` | string | no | Pin a specific code-system version, forwarded to the terminology server as `system-version`. |
| `max_candidates` | int | no | Maximum number of candidates returned. Top-N by confidence, with ties at the boundary pulled in. Default 3. |
| `effort` | `"fast" \| "balanced" \| "best"` | no | How hard to try. `fast` = quick lookup, may bail early on hard cases. `balanced` (default) = full evaluation pipeline. `best` = more iterations on hard cases at the cost of latency. |
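A client may want to validate these fields before sending. The sketch below builds the request body and enforces the documented rules: text is required, at least one of context or url is required, and effort must be one of the three listed values. The function name is hypothetical. Whether context and url may be sent together is not stated here, so the sketch allows both.

```python
from typing import Optional

ALLOWED_EFFORT = {"fast", "balanced", "best"}


def build_find_code_payload(text: str,
                            context: Optional[str] = None,
                            url: Optional[str] = None,
                            system: Optional[str] = None,
                            system_version: Optional[str] = None,
                            max_candidates: Optional[int] = None,
                            effort: Optional[str] = None) -> dict:
    """Build a find-code request body, omitting optional fields left unset."""
    if not text:
        raise ValueError("text is required")
    if not (context or url):
        raise ValueError("one of context or url is required")
    if effort is not None and effort not in ALLOWED_EFFORT:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    payload = {"text": text}
    optional = {
        "context": context,
        "url": url,
        "system": system,
        "system_version": system_version,
        "max_candidates": max_candidates,
        "effort": effort,
    }
    # Leave unset fields out so the service applies its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```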

Response:

{
  "matches": [
    {
      "code": "44054006",
      "system": "http://snomed.info/sct",
      "display": "Diabetes mellitus type 2",
      "confidence": 0.95,
      "reasoning": "exact match on preferred term"
    }
  ]
}

| Field | Description |
|---|---|
| `matches[]` | Ranked candidates. `matches[0]` is the primary suggestion. May be empty if no plausible code exists. |
| `matches[].confidence` | 0.0 to 1.0. Values ≥ 0.9 are typically usable without human review; values below 0.7 should be treated as suggestions and confirmed. |
| `matches[].reasoning` | Human-readable explanation of why this code was selected. |
| `intersection_codes[]` | When a single code can't capture the full meaning, additional codes whose intersection with `matches[0]` represents the complete meaning. |
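One way to act on the confidence guidance is a small triage step on the primary match. The sketch below uses the two documented thresholds; the handling of the band from 0.7 up to 0.9 is not specified on this page, so routing it to "review" is an assumption, as are the label names.

```python
from typing import Optional, Tuple


def triage(response: dict) -> Tuple[str, Optional[dict]]:
    """Classify the primary match by the documented confidence thresholds."""
    matches = response.get("matches", [])
    if not matches:
        return ("no_match", None)
    top = matches[0]  # matches[0] is the primary suggestion
    confidence = top["confidence"]
    if confidence >= 0.9:
        return ("accept", top)      # typically usable without human review
    if confidence < 0.7:
        return ("suggestion", top)  # treat as a suggestion and confirm
    return ("review", top)          # assumption: the middle band is not specified
```

For the example response above (confidence 0.95) this returns the "accept" label with the 44054006 entry.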

GET /api/v1/find-code

Same handler as POST, parameters in the query string. The endpoint accepts both forms because:

POST is the canonical form for integrations: a JSON body, no URL length limits, easy to construct programmatically.
GET exists for ad-hoc exploration and shareable debug links — paste a curl --data-urlencode line into a chat and the receiver can run it as-is, or open the URL in a browser.

Behaviour is identical between the two; choose by use case, not by capability.

curl examples (beta URL)

The hostname below is for early-access testing only and will change before general availability.

# Replace $TOKEN with your bearer token

curl -s "https://code-search.australiaeast.cloudapp.azure.com/api/v1/find-code" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "type 2 diabetes",
    "context": "http://hl7.org.au/fhir/core/StructureDefinition/au-core-condition#Condition.code"
  }' | jq .

# GET equivalent
curl -s -G "https://code-search.australiaeast.cloudapp.azure.com/api/v1/find-code" \
  -H "Authorization: Bearer $TOKEN" \
  --data-urlencode "text=type 2 diabetes" \
  --data-urlencode "context=http://hl7.org.au/fhir/core/StructureDefinition/au-core-condition#Condition.code" | jq .
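The same two calls can be sketched in Python with only the standard library. The helper names are hypothetical and the hostname is the beta one from the curl examples. The functions build requests without sending them. Note that the GET form must percent-encode the # in the context value, otherwise it would be read as a URL fragment and never reach the server; urlencode takes care of that.

```python
import json
import urllib.parse
import urllib.request

# Beta hostname taken from the curl examples above; it will change before GA.
BASE_URL = "https://code-search.australiaeast.cloudapp.azure.com"


def post_request(token: str, payload: dict) -> urllib.request.Request:
    """Build the POST form: JSON body plus bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/find-code",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def get_request(token: str, params: dict) -> urllib.request.Request:
    """Build the GET form: the same parameters, URL-encoded into the query string."""
    query = urllib.parse.urlencode(params)  # encodes '#' as %23
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/find-code?{query}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Either request can be sent with urllib.request.urlopen and the body decoded with json.load.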

Reporting feedback

After automapping clinical text to a code, callers can report the outcome so it can become training and evaluation data. Three actions fall out of the same endpoint:

Accept: the service's code was correct; supply it as chosen_code and set supplied_code equal to it. Going by the kind values described below, an entry that omits supplied_code is classified as novel rather than confirmed.
Correct: the service returned a code but it was wrong; supply both supplied_code (what the service said) and chosen_code (the right code).
Supply: the service returned nothing and a human provided the code; supply only chosen_code.

POST /api/v1/feedback

Authenticated (same bearer-token auth as find-code).

Request body:

{
  "text": "T2DM",
  "context": "http://hl7.org.au/fhir/core/StructureDefinition/au-core-condition#Condition.code",
  "system": "http://snomed.info/sct",
  "chosen_code": "44054006",
  "chosen_display": "Diabetes mellitus type 2",
  "supplied_code": "73211009",
  "supplied_display": "Diabetes mellitus"
}

| Field | Type | Required | Description |
|---|---|---|---|
| `text` | string | yes | The original search text that was mapped |
| `context` | string | one of context/url required | FHIR profile element with binding (same as `find-code`) |
| `url` | string | one of context/url required | ValueSet canonical URL (same as `find-code`) |
| `system` | string | no | The code system that was searched (the supplied code’s system). Optional: defaults to `chosen_system` if omitted, else `http://snomed.info/sct`. Send it explicitly (alongside `chosen_system`) to record a genuine cross-system correction. |
| `chosen_code` | string | yes | The human-confirmed correct code |
| `chosen_system` | string | no | Code system of `chosen_code`; defaults to `system`. Lets a correction cross code systems. |
| `chosen_display` | string | no | Display term for the chosen code |
| `supplied_code` | string | no | The code the service originally returned. When present and different from `chosen_code`, recorded as a negative example. |
| `supplied_display` | string | no | Display term for the supplied code |

Response — 201 Created:

{
  "id": "fb_01j8z...",
  "kind": "correction"
}

kind is one of confirmed (supplied == chosen), correction (supplied present but differs from chosen), or novel (no supplied code — human supplied where the service had nothing).
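The three kind values follow directly from which codes are sent, so a client can predict the kind before posting. The sketch below mirrors the rule as written; the function name is hypothetical, and it compares codes only, since this page does not say whether the code system also takes part in the comparison.

```python
from typing import Optional


def expected_kind(chosen_code: str, supplied_code: Optional[str] = None) -> str:
    """Predict the feedback kind from the chosen and supplied codes."""
    if not supplied_code:
        return "novel"        # no supplied code: a human filled the gap
    if supplied_code == chosen_code:
        return "confirmed"    # the service's code was accepted
    return "correction"       # the service's code was replaced
```

For the request body above (supplied 73211009, chosen 44054006) this gives correction, matching the example response.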

MCP (Model Context Protocol)

The service exposes an MCP streamable-HTTP endpoint at /mcp with one tool: find_code. Use it from any MCP-aware client (Claude Desktop, Claude Code, Cline, etc.).

Claude Desktop configuration (beta URL)

Claude Desktop talks to MCP servers over stdio. To bridge that to our HTTP endpoint we use the mcp-remote npm package: Claude Desktop launches it locally via npx, and it handles OAuth Protected Resource Metadata discovery and opens a browser the first time you connect. Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "code-search": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://code-search.australiaeast.cloudapp.azure.com/mcp",
        "--static-oauth-client-info",
        "{\"client_id\":\"code-search-mcp\"}"
      ]
    }
  }
}

Save, fully quit Claude Desktop (⌘Q on macOS, not just close the window) and reopen. The first time the model invokes find_code a browser tab opens against the code-search portal for sign-in; after that the token is cached by mcp-remote locally and refreshed automatically. No bearer token in the config file, ever.

The --static-oauth-client-info flag tells mcp-remote to use our pre-registered code-search-mcp public client instead of attempting Dynamic Client Registration (which the authorisation server doesn't expose to the public internet).

Other OAuth-aware MCP clients (Claude Code, Cursor, Cline) have native HTTP MCP support and can point directly at https://code-search.australiaeast.cloudapp.azure.com/mcp without the mcp-remote bridge — check each client's docs for the exact config shape.

Try it

After Claude Desktop has connected and you've logged in, paste this into a new chat:

Use the code-search find_code tool to find a code for the following clinical text,
in the context of an AU Core Condition resource:

  T2DM with diabetic retinopathy

The binding context for the resource element is
http://hl7.org.au/fhir/core/StructureDefinition/au-core-condition#Condition.code.

Show me the primary code, any intersection codes, and the reasoning.

Claude calls find_code, the service expands the abbreviation (T2DM → type 2 diabetes), recognises that "diabetic retinopathy" is a secondary clinical concept that doesn't fit in the same Condition.code, and returns it as an intersection code. You'll see Claude reflect both back to you with the SNOMED codes and a short explanation of how to model them on the FHIR resource.

The `find_code` tool

Same inputs as POST /api/v1/find-code: text, context/url, system, system_version, max_candidates, effort.

Output comes back in two parallel forms on the same MCP tool result. Clients pick whichever they prefer — there's no "mode" toggle.

Prose in content[].text — what an LLM-driven client (Claude Desktop, Cursor, Cline) reads into its context window. Compact and direct.
Structured JSON in structuredContent — the same matches[] / intersection_codes[] shape as the REST response, for programmatic clients that prefer typed data.

Example tool result for "type 2 diabetes":

// content[0].text (what the LLM reads)
Found 44054006 — Diabetes mellitus type 2 (95% confidence)

Reasoning: exact match on preferred term

// structuredContent (programmatic access)
{
  "matches": [
    {
      "code": "44054006",
      "system": "http://snomed.info/sct",
      "display": "Diabetes mellitus type 2",
      "confidence": 0.95,
      "reasoning": "exact match on preferred term"
    }
  ]
}

Example for an empty-match case:

// content[0].text
No suitable code found for "wifi triggering seizures".

// structuredContent
{ "matches": [] }
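A programmatic client could read the two forms as in the sketch below: prefer structuredContent when present and keep the prose alongside it. The function name and the returned dictionary shape are hypothetical. The sketch assumes text items in content[] carry type "text", as in the MCP tool-result format.

```python
def read_tool_result(tool_result: dict) -> dict:
    """Pull matches, intersection codes, and prose out of a find_code tool result."""
    # Join all text items; this is what an LLM-driven client would read.
    prose = "\n".join(
        item.get("text", "")
        for item in tool_result.get("content", [])
        if item.get("type") == "text"
    )
    structured = tool_result.get("structuredContent")
    if isinstance(structured, dict):
        return {
            "matches": structured.get("matches", []),
            "intersection_codes": structured.get("intersection_codes", []),
            "text": prose,
        }
    # No structured form available: matches are unknown, only prose remains.
    return {"matches": None, "intersection_codes": None, "text": prose}
```

An empty matches list then means the service found nothing, while None means the structured form was not provided and only the prose can be shown.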

The MCP endpoint advertises its protected-resource metadata at /.well-known/oauth-protected-resource per RFC 9728 — clients that support auth discovery will pick this up automatically.

For raw ValueSet/$expand browsing without LLM evaluation, use Ontoserver's own MCP tools — this service deliberately doesn't duplicate that surface.

Error responses

| Status | Body | Meaning |
|---|---|---|
| 200 | Result object | Success (matches may be empty if no plausible code exists) |
| 400 | `{"error":"validation_error","detail":...}` | Invalid request shape |
| 401 | `{"error":"Unauthorized"}` | Missing / invalid bearer token |
| 403 | `{"error":"forbidden_no_role"}` | Token is valid but doesn't grant access to this service |
| 404 | `{"error":"valueset_not_found"}` | The terminology server returned 404 for the resolved ValueSet URL: the ValueSet is genuinely unknown or unavailable. Distinct from `422 binding_not_resolved`, where the problem is the context path not carrying a binding rather than the ValueSet itself being missing. |
| 422 | `{"error":"binding_not_resolved","context":"…","cause":"…","detail":"…","suggestion":"…"}` | A profile `context` (`<StructureDefinition>#<elementPath>`) could not be resolved to a ValueSet binding. The response fields are described below the table. |
| 504 | `{"error":"upstream_error"}` | Upstream terminology server unreachable / errored |

A `422 binding_not_resolved` response includes:

`context`: the context string as supplied
`cause`: one of the following
`sd_not_found`: the StructureDefinition URL could not be fetched
`element_not_found`: the element path does not exist in the StructureDefinition's snapshot (common for datatype sub-elements; see below)
`no_binding_at_element`: the element exists but carries no ValueSet binding
`detail`: human-readable explanation
`suggestion`: recommended fix

Common case: datatype sub-element path. An element such as `MedicationRequest.dosageInstruction.route` is a sub-element of the `Dosage` datatype; its binding lives on the datatype definition, not on `MedicationRequest` directly. The resolver returns `cause: "element_not_found"` because that path doesn't appear in the `MedicationRequest` StructureDefinition snapshot. Fix: pass the bound ValueSet URL directly via the `url` parameter, or use an element path that carries a binding on the profile you are targeting.
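A client can turn these responses into a short next-step hint. The sketch below covers the documented statuses; the function name and the wording of the hints are illustrative, and advising a retry on 504 is an assumption rather than something this page states.

```python
def describe_error(status: int, body: dict) -> str:
    """Map an error status and JSON body to a short next-step hint."""
    error = body.get("error", "")
    if status == 400:
        return f"fix the request shape: {body.get('detail')}"
    if status == 401:
        return "obtain or refresh the bearer token"
    if status == 403:
        return "token is valid but lacks access to this service"
    if status == 404 and error == "valueset_not_found":
        return "check the ValueSet URL; the terminology server does not know it"
    if status == 422 and error == "binding_not_resolved":
        cause = body.get("cause")
        if cause == "element_not_found":
            # Typical for datatype sub-elements such as dosageInstruction.route.
            return "element path not in the profile snapshot; pass the ValueSet via url"
        # Otherwise prefer the service's own suggestion when it sends one.
        return body.get("suggestion") or f"binding not resolved ({cause})"
    if status == 504:
        return "upstream terminology server failed; retry later"  # retry is an assumption
    return f"unexpected response {status}: {error}"
```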

Reporting issues

Email ontoserver-support@csiro.au. Include:

The sub claim from your token (not the token itself)
Approximate timestamp of the request
The full request body (or query string)
What you got back vs. what you expected

For unexpected codes specifically, include the FHIR context URL plus what you'd consider the correct code — that's the most useful form of feedback.
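To read the sub claim without sharing the token, the payload segment of the JWT can be decoded locally. The sketch below does not verify the signature, so it is only suitable for inspecting your own token; the function name is hypothetical, and it assumes the usual three-part compact JWT format.

```python
import base64
import json


def token_sub(token: str) -> str:
    """Return the sub claim from a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload_b64 = parts[1]
    # JWTs strip base64url padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["sub"]
```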