About PubID

The meaning of a document lives before its name is written

Origin Story

At Ribose, we build systems that manage standards documents — and we kept running into the same wall. Every publisher identifies documents differently. ISO uses ISO 9001:2015. IEEE writes IEEE Std 802.3-2018. NIST has NIST SP 800-53 Rev. 5. These identifiers carry rich semantic meaning — publisher, document type, number, year, stage, part — but that meaning is locked inside conventions that only humans can parse, and only imperfectly.

There was no universal way to:

  • Parse an identifier into its semantic components
  • Exchange identifiers between systems without loss
  • Render the same identifier in multiple formats
  • Compare identifiers across different publishers

PubID was born from a conviction that the meaning of an identifier should be separate from any single representation of it. We needed a metaschema — a shared grammar — that could capture the full semantic depth of any publication identifier and express it in whatever form the situation demands.

Since then, PubID has grown to cover 26+ publishers across international, regional, national, and industry standards, with a Ruby reference implementation that parses, renders, and interchanges identifiers with round-trip fidelity.

The Metaschema

The PubID metaschema defines the common elements that make up any publication identifier:

| Element | Required | Description |
|---|---|---|
| Publisher | Yes | The issuing organization (ISO, IEC, IEEE, etc.) |
| Document Type | Yes | The type of deliverable (Standard, Report, Guide, etc.) |
| Document Number | Yes | The unique identifier number |
| Year | Optional | Publication or revision year |
| Part | Optional | Part number for multi-part standards |
| Edition | Optional | Edition number |
| Stage | Optional | Development stage (Draft, CD, DIS, FDIS, etc.) |
| Language | Optional | Language code (en, fr, ru, etc.) |
| Supplement | Optional | Amendment, Corrigendum, Addendum |

Each publisher's schema specifies which elements are used, their allowed values, and how they combine syntactically.
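As a rough illustration of the metaschema elements listed above, the following Python sketch models them as a single record. The class name `Identifier`, its field names, and the `present_elements` helper are hypothetical and are not taken from the pubid-ruby reference implementation; this is only one plausible way to hold the required and optional elements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identifier:
    """Hypothetical container for the metaschema elements (not the pubid-ruby API)."""

    # Required elements
    publisher: str
    doc_type: str
    number: str
    # Optional elements stay None when an identifier does not carry them
    year: Optional[int] = None
    part: Optional[str] = None
    edition: Optional[str] = None
    stage: Optional[str] = None
    language: Optional[str] = None
    supplement: Optional[str] = None

    def present_elements(self) -> dict:
        """Return only the elements that actually carry a value."""
        return {name: value for name, value in vars(self).items() if value is not None}


iso_9001 = Identifier(publisher="ISO", doc_type="Standard", number="9001", year=2015)
print(iso_9001.present_elements())
```

A publisher-specific schema would then constrain which of these fields may be set and which values they accept.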

Identifier Composition

Identifiers compose through algebraic relationships. A supplement identifier embeds a base identifier — the supplement carries its own type, number, and year, while the base remains intact inside it:

| Composition | Example | Notes |
|---|---|---|
| Base | ISO 9001:2015 | publisher, number, year |
| Supplement embeds Base | ISO 9001:2015/Amd 1:2023 | amendment identifier contains the base |
| With Copublisher | ISO/IEC 17031-1:2020 | joint publication |
| Composite | ISO/IEC 17031-1:2020/Amd 1:2022 | amendment embeds copublished base |
| 3-Level Nesting | ISO/IEC 13818-1:2015/Amd 3:2016/Cor 1:2017 | corrigendum embeds amendment, which embeds base standard |
| Adoption | BS EN ISO 9001:2015 | national body adopts European norm, which adopts international standard |
| Dual Published | IEC 60255-24 Ed. 2.0 2013-04 and IEEE Std C37.111-2013 | two equivalent identifiers from different publishers |
| Bundled | CSA B127.1:99 + B127.2:99 | multiple standards sold as a package |

Each layer is a complete identifier in its own right. A supplement identifier does not modify the base — it wraps it, adding its own type (Amendment, Corrigendum), number, and year.
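The wrapping relationship described above can be sketched as nested records. The class names `Base` and `Supplement` and their `render` methods are assumptions made for illustration only; the rendering rule simply follows the ISO-style examples in this section and is not the reference implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """Hypothetical base identifier: publisher(s), number, year."""

    publisher: str  # may name copublishers, e.g. "ISO/IEC"
    number: str     # includes any part, e.g. "13818-1"
    year: int

    def render(self) -> str:
        return f"{self.publisher} {self.number}:{self.year}"


@dataclass(frozen=True)
class Supplement:
    """Hypothetical supplement that wraps another identifier without modifying it."""

    base: Union["Base", "Supplement"]
    kind: str    # "Amd" for amendment, "Cor" for corrigendum
    number: int
    year: int

    def render(self) -> str:
        # The wrapped identifier renders itself; the supplement appends its own layer.
        return f"{self.base.render()}/{self.kind} {self.number}:{self.year}"


standard = Base(publisher="ISO/IEC", number="13818-1", year=2015)
amendment = Supplement(base=standard, kind="Amd", number=3, year=2016)
corrigendum = Supplement(base=amendment, kind="Cor", number=1, year=2017)
print(corrigendum.render())
```

Because each layer holds the one beneath it as an ordinary value, the base standard can still be reached unchanged through `corrigendum.base.base`.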

Multi-Style Rendering

A key innovation in PubID is the ability to render the same identifier in multiple styles without information loss. A single identifier is parsed into structured components, then re-rendered in any output format:

Input identifier: ISO 9001:2015

Parsed structure (International Standard):

| Element | Value |
|---|---|
| Publisher | ISO |
| Number | 9001 |
| Year | 2015 |
| Edition | 5 |

Lossless rendering:

| Style | Output |
|---|---|
| Human-Readable | ISO 9001:2015 |
| Short | ISO 9001 |
| Machine-Readable | ISO 9001:2015 |
| URN | urn:iso:std:iso:9001:ed-5:en |
| JSON | {"publisher":"ISO","number":"9001","year":2015} |

All conversions are bidirectional: parse any output style and re-render in any other without information loss.

This pattern extends across all supported publishers: every PubID can render as a human-readable string, a URN, or structured JSON.
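To make the parse-then-render idea concrete, here is a minimal Python sketch covering only the simplest ISO shape (publisher, number, year). The function names and the regular expression are hypothetical and far narrower than a real publisher schema. The URN style is left out because it needs an edition and a language, which the input string `ISO 9001:2015` does not carry.

```python
import json
import re

# Assumed pattern for the simplest case only: "<PUBLISHER> <number>:<year>"
PATTERN = re.compile(r"^(?P<publisher>[A-Z]+) (?P<number>\d+):(?P<year>\d{4})$")


def parse_human(text: str) -> dict:
    """Parse a human-readable identifier into structured components."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized identifier: {text!r}")
    return {
        "publisher": match["publisher"],
        "number": match["number"],
        "year": int(match["year"]),
    }


def render_human(parts: dict) -> str:
    """Render structured components as a human-readable string."""
    return f"{parts['publisher']} {parts['number']}:{parts['year']}"


def render_json(parts: dict) -> str:
    """Render structured components as compact JSON."""
    return json.dumps(parts, separators=(",", ":"))


def parse_json(text: str) -> dict:
    """Parse the JSON style back into structured components."""
    return json.loads(text)


parts = parse_human("ISO 9001:2015")
as_json = render_json(parts)
# Round trip: human-readable -> structure -> JSON -> structure -> human-readable
round_tripped = render_human(parse_json(as_json))
print(as_json)
print(round_tripped)
```

The round trip holds here because both styles carry the same three components; a style that omits a component (such as a year-less short form) could not be parsed back to the full structure on its own.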

The Ecosystem

| Component | Description |
|---|---|
| Metaschema | Formal definition of identifier elements and their relationships |
| Publisher Schemas | 26+ publisher-specific schema definitions |
| Reference Library | Ruby gem implementing all schemas with parse/render/URN support |
| This Website | Documentation, interactive playground, and schema registry |

Data pipeline: Publisher schema data on this site is exported directly from the pubid-ruby reference implementation. The export version is:

pubid-ruby v2.0.0 · rt-new-lutaml-model · 2026-05-04

NIST: The First Adopter

In 2021, NIST published the Publication Identifier Syntax for NIST Technical Series Publications — a formal scheme for uniquely identifying every document across the 53 publication series and 19,333+ documents in the NIST Library, dating back to 1901.

NIST was the first standards organization to adopt a multi-style, round-trippable PubID scheme with four defined rendering styles:

| Style | Usage | Example |
|---|---|---|
| Full | Title page and bibliography | National Institute of Standards and Technology Special Publication 800-53, Revision 5 |
| Abbreviated | Authority section | Natl. Inst. Stand. Technol. Spec. Publ. 800-53 Rev. 5 |
| Short | Inline citations | NIST SP 800-53 Rev. 5 |
| Machine-Readable | DOI suffix | NIST.SP.800-53r5 |
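The four styles above can be reproduced from one structured value. The sketch below is a hypothetical illustration limited to the single example in the table (series SP, number 800-53, revision 5); the names `NistId`, `PUBLISHER`, `SERIES`, and `render` are assumptions, and the formatting rules are inferred only from that example rather than from the NIST syntax document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NistId:
    """Hypothetical structured form of a NIST identifier."""

    series: str    # series code, e.g. "SP"
    number: str    # e.g. "800-53"
    revision: int  # e.g. 5


# Publisher and series names per style, taken from the example table above
PUBLISHER = {
    "full": "National Institute of Standards and Technology",
    "abbreviated": "Natl. Inst. Stand. Technol.",
    "short": "NIST",
}
SERIES = {
    "SP": {"full": "Special Publication", "abbreviated": "Spec. Publ.", "short": "SP"},
}


def render(ident: NistId, style: str) -> str:
    """Render one identifier in one of: full, abbreviated, short, mr."""
    if style == "mr":
        # Machine-readable form: dot-separated, revision appended as "r<N>"
        return f"NIST.{ident.series}.{ident.number}r{ident.revision}"
    publisher = PUBLISHER[style]
    series = SERIES[ident.series][style]
    if style == "full":
        return f"{publisher} {series} {ident.number}, Revision {ident.revision}"
    return f"{publisher} {series} {ident.number} Rev. {ident.revision}"


sp_800_53 = NistId(series="SP", number="800-53", revision=5)
for style in ("full", "abbreviated", "short", "mr"):
    print(render(sp_800_53, style))
```

Keeping the style-specific wording in lookup tables, separate from the identifier itself, reflects the idea that the structured value is the source and each style is only one rendering of it.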

Ribose worked alongside the NIST Information Services Office and CSRC teams from the conception phase of the universal PubID scheme, a contribution acknowledged in both the original 2020 draft and the final PubID 1.0 document. Ribose submitted formal comments during the public review period, built the nist-pubid conversion tool that migrated all 19,333 legacy identifiers, and assisted with the CSWP identifier migration when NIST moved from date-based to sequential numbering.

The NIST PubID demonstrated that a well-designed identifier scheme can serve both human and machine needs simultaneously — a principle that PubID carries forward across all publishers.

What We Believe

PubID exists to preserve the integrity of meaning across every form a publication identifier can take — human or machine, verbose or terse, printed or digital.

  • The meaning of an identifier is prior to and independent of any particular rendering
  • Human-readable and machine-readable forms are equally valid expressions of the same reality
  • Round-trip fidelity — parse any form, recover the original — is not just a feature but a philosophical commitment
  • Every identifier carries depth that no single surface representation exhausts

PubID stands for Publication Identifier. It is both a noun and a mission: to give every publication an identifier that is universally parsable, unambiguously structured, and faithful to the original meaning it represents.

The logo is not decoration — it is a statement of philosophy:

PubID Logo

道可道,非常道;名可名,非常名

The Way that can be spoken is not the eternal Way;
the name that can be named is not the eternal name.

Every element encodes a layer of meaning:

The Pool

The pool is the Way: the original, complete meaning of a publication identifier. It is the source — the semantic model from which all representations emerge and to which they return. Its depths are profound and unfathomable; meaning exists there in its purest form, beyond any particular expression, the deep structure that no single rendering can fully capture. Every identifier, in every format, points back to the same underlying reality in this pool.

The Surface

The surface of the pool is where meaning begins to take name and form. It is the boundary between the unnameable and the named — where depth rises toward expression, where semantic models crystallize into identifiable structure. A name is real and necessary, yet it is never the thing itself.

Waves & Particles

The waves (human-readable identifiers) and particles (machine-readable forms) are 有 — being, manifest existence. Every representation is real and functional, yet no single form is constant. Human-readable strings and machine-readable URNs are two modes of the same 有 — both arising from the same source, both legitimate, neither permanent.

The Space Between

The emptiness between the waves and particles is 無 — non-being, the void. It is not absence but potential: the space that gives each form its meaning. Without the gaps, there is no structure; without 無, 有 has no shape. The void is what makes the identifier parsable — the delimiters, the spaces, the structure that separates one component from another.

Open Source

PubID is proudly open source.

An open source project maintained by Ribose
