A Ruby gem for working with the ResourceSync web synchronization framework.
It consists of the following:
- Classes corresponding to the major document types defined in the ResourceSync specification, such as Resource Lists, Change Lists, Source Descriptions and so on. Each of these classes has a load_from_xml method that can parse the corresponding XML document (as an REXML::Element), and a save_to_xml method that can serialize an instance of that class to XML (as an REXML::Element).
- Classes for the major sub-structures of those documents, such as the <url> and <sitemap> tags (subsumed under the Resource class) defined by the Sitemap specification, as well as the ResourceSync-specific <rs:ln> and <rs:md> tags (the Link and Metadata classes, respectively).
- An XMLParser class that can take a ResourceSync-augmented Sitemap document (in the form of an REXML::Element, an REXML::Document, a string, an IO, or something sufficiently IO-like that REXML::Document can parse it) and produce an instance of the appropriate class based on the capability attribute in the root element's metadata.
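As a rough illustration of the dispatch described in the last item above, the sketch below uses only REXML to read the capability attribute from the <rs:md> child of the root element. It is not resync's implementation: the helper name capability_of and the sample document are assumptions made for this example, and the namespace URI is the one the ResourceSync specification assigns to the rs: prefix.

```ruby
require 'rexml/document'

# Namespace URI bound to the rs: prefix in ResourceSync documents.
RS_NAMESPACE = 'http://www.openarchives.org/rs/terms/'

# Hypothetical helper (not part of resync): returns the capability declared
# by the <rs:md> child of the root element, or nil if there is none.
def capability_of(xml)
  root = REXML::Document.new(xml).root
  return nil unless root

  md = root.elements.to_a.find do |el|
    el.name == 'md' && el.namespace == RS_NAMESPACE
  end
  md && md.attributes['capability']
end

# Sample document invented for this example.
SAMPLE = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
          xmlns:rs="http://www.openarchives.org/rs/terms/">
    <rs:ln rel="up" href="http://example.com/my-dataset/my-capability-list.xml"/>
    <rs:md capability="changelist" from="2013-01-03T00:00:00Z"/>
  </urlset>
XML

puts capability_of(SAMPLE)
```

A parser built along these lines could then map the returned string to a document class; how resync itself performs that mapping is not shown here.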
require 'resync'
data = File.read('my-capability-list.xml')
capability_list = Resync::XMLParser.parse(data)
require 'resync'
change_list = Resync::ChangeList.new(
  links: [ Resync::Link.new(rel: 'up', href: 'http://example.com/my-dataset/my-capability-list.xml') ],
  metadata: Resync::Metadata.new(
    capability: 'changelist',
    from_time: Time.utc(2013, 1, 3)
  ),
  resources: [
    # ... generate list of changes here ...
  ]
)
xml = change_list.save_to_xml
formatter = REXML::Formatters::Pretty.new
formatter.write(xml, $stdout)
resync-client, a Ruby client library for ResourceSync.
This is a work in progress. Bug reports and feature requests are welcome (particularly on the document creation side, which our use cases haven't really explored).
There are certain well-specified relationships between elements: most document types should always have a link with an up relationship, many resources should have metadata with a defined capability attribute, and so on. In some cases there are convenience getters for these attributes on the 'parent' object (e.g. you can ask for the capability directly without violating the law of Demeter), but there generally aren't corresponding convenience setters, or convenience initializer parameters.
Document types (ChangeList, ResourceList, etc.) will create a Metadata with the appropriate capability for themselves if none is specified, but if they're initialized with one that doesn't declare a capability, they'll raise an exception rather than fill it in (just as they'll raise an exception if the wrong capability is specified).
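The three cases above (no metadata, metadata without a capability, metadata with the wrong capability) can be modeled in a few lines. This is a stand-alone sketch with stand-in types, not resync's code: StandInMetadata and metadata_for are hypothetical names, and ArgumentError is an assumption, since the text only says that an exception is raised.

```ruby
# Stand-in for illustration; resync's own Metadata class carries more fields.
StandInMetadata = Struct.new(:capability)

# Hypothetical helper mirroring the rule described above: supply a default
# when no metadata is given, otherwise insist on the expected capability.
def metadata_for(expected, metadata = nil)
  return StandInMetadata.new(expected) if metadata.nil?
  return metadata if metadata.capability == expected

  raise ArgumentError,
        "expected capability #{expected.inspect}, got #{metadata.capability.inspect}"
end

metadata_for('changelist')                                    # default is created
metadata_for('changelist', StandInMetadata.new('changelist')) # accepted as-is
begin
  metadata_for('changelist', StandInMetadata.new(nil))        # not filled in
rescue ArgumentError => e
  warn e.message
end
```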
A ChangeList should contain only resources with Metadata declaring a change type. The resources in a ResourceDumpManifest should each declare a path indicating their locations in the ZIP file. resync doesn't currently do anything to enforce, validate, or assist in compliance with these and similar restrictions.
(An exception: document types will complain if initialized with Metadata having the wrong capability.)
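Since these restrictions are left to the caller, a check along the following lines could be run before serializing a document. It is only a sketch: the Struct stand-ins, the helper names, and the accessor names (uri, metadata, change, path) are assumptions made for this example and should be checked against the actual Resource and Metadata classes.

```ruby
# Stand-ins for illustration only; in real code these would be resync's
# Resource and Metadata objects, assumed to expose the same accessor names.
SketchMetadata = Struct.new(:change, :path, keyword_init: true)
SketchResource = Struct.new(:uri, :metadata, keyword_init: true)

# Hypothetical helper: resources whose metadata declares no change type.
def missing_change(resources)
  resources.reject { |r| r.metadata && r.metadata.change }
end

# Hypothetical helper: resources whose metadata declares no path.
def missing_path(resources)
  resources.reject { |r| r.metadata && r.metadata.path }
end

resources = [
  SketchResource.new(uri: 'http://example.com/a.txt',
                     metadata: SketchMetadata.new(change: 'updated', path: '/a.txt')),
  SketchResource.new(uri: 'http://example.com/b.txt',
                     metadata: SketchMetadata.new),
  SketchResource.new(uri: 'http://example.com/c.txt')
]

# Report offenders instead of raising, so all problems surface in one pass.
missing_change(resources).each { |r| warn "no change type: #{r.uri}" }
missing_path(resources).each { |r| warn "no path: #{r.uri}" }
```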
The required/forbidden time attributes defined in Appendix A, "Time Attribute Requirements", of the ResourceSync specification are not enforced; it's possible to create, e.g., a ResourceList with a from_time on its metadata, or a ChangeList with members whose metadata does not declare a modified_time, even though both scenarios are forbidden by the specification.
The ResourceSync schema defines restrictions on the values of several attributes:
- Path values must start with a slash and must not end with a slash
- Priorities must be positive and < 1,000,000
- Link relation types must conform with RFC 5988
The Sitemap and Sitemap index schemas also define some restrictions:
- URIs have a minimum length of 12 and a max of 2048 characters.
- Priorities must be in the range 0.0-1.0 (inclusive)
None of these restrictions are currently enforced by resync.
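Callers who need these guarantees can check values themselves before building a document. The module below is a minimal sketch of such checks, written directly from the restrictions listed above; ValueChecks and its method names are hypothetical and not part of resync, and the RFC 5988 link relation check is left out because it does not reduce to a one-line rule.

```ruby
# Hypothetical validators (not part of resync) for the restrictions above.
module ValueChecks
  module_function

  # ResourceSync path: starts with a slash and does not end with one.
  def valid_path?(path)
    path.start_with?('/') && !path.end_with?('/')
  end

  # ResourceSync priority: positive and below 1,000,000.
  def valid_link_priority?(priority)
    priority > 0 && priority < 1_000_000
  end

  # Sitemap URI: between 12 and 2048 characters long.
  def valid_uri_length?(uri)
    uri.to_s.length.between?(12, 2048)
  end

  # Sitemap priority: within 0.0 to 1.0, inclusive.
  def valid_sitemap_priority?(priority)
    priority.between?(0.0, 1.0)
  end
end

warn 'bad path' unless ValueChecks.valid_path?('/resources/a.txt')
warn 'bad priority' unless ValueChecks.valid_sitemap_priority?(0.5)
```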
When reading a ResourceSync document from XML and writing it back out, <rs:ln> elements will always appear before <rs:md> elements, regardless of their order in the original source.
The XML::Mapping library resync uses doesn't support namespaces, so namespace handling in resync is a bit hacky. In particular, you may see strange behavior when using <rs:ln>, <rs:md>, <url>, or <sitemap> tags outside the context of a <urlset>/<sitemapindex>.