2 unstable releases

| Version | Released |
|---|---|
| 0.2.1 | Nov 19, 2022 |
| 0.2.0 | |
| 0.1.0 | Sep 22, 2022 |
This crate provides tools to deserialize structs from XML; most notably, it provides a derive macro to automate that process (by implementing DeserializeXml for you).
Note: the implementation is highly limited and inelegant. I wrote this purely to help power a feed reader I'm working on as a personal project; don't expect anything "production-ready..." (See the caveats below.)
Examples
Basic
Here's how you could use this crate to easily parse a very simple XML structure:
use deserialize_xml::DeserializeXml;
#[derive(Default, Debug, DeserializeXml)]
struct StringOnly {
title: String,
author: String,
}
let input = "<stringonly><title>Title</title><author>Author</author></stringonly>";
// `from_str` here was provided by `#[derive(DeserializeXml)]` above
let result = StringOnly::from_str(input).unwrap();
assert_eq!(result.title, "Title");
assert_eq!(result.author, "Author");
Advanced
This example shows more advanced functionality:
use deserialize_xml::DeserializeXml;
#[derive(Default, Debug, DeserializeXml)]
// This attribute indicates we should parse this struct upon encountering an <item> tag
#[deserialize_xml(tag = "item")]
struct StringOnly {
title: String,
author: String,
}
#[derive(Default, Debug, DeserializeXml)]
struct Channel {
title: String,
// This allows us to use an idiomatic name for the
// struct member instead of the raw tag name
#[deserialize_xml(tag = "lastUpdated")]
last_updated: String,
ttl: u32,
// (unfortunately, we need to repeat `tag = "item"` here for now)
#[deserialize_xml(tag = "item")]
entries: Vec<StringOnly>,
}
let input = r#"<channel>
<title>test channel please ignore</title>
<lastUpdated>2022-09-22</lastUpdated>
<ttl>3600</ttl>
<item><title>Article 1</title><author>Guy</author></item>
<item><title>Article 2</title><author>Dudette</author></item>
</channel>"#;
let result = Channel::from_str(input).unwrap();
assert_eq!(result.title, "test channel please ignore");
assert_eq!(result.last_updated, "2022-09-22");
assert_eq!(result.ttl, 3600);
assert_eq!(result.entries.len(), 2);
assert_eq!(result.entries[0].title, "Article 1");
assert_eq!(result.entries[0].author, "Guy");
assert_eq!(result.entries[1].title, "Article 2");
assert_eq!(result.entries[1].author, "Dudette");
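How does the derive know that entries is a collection? According to the caveats below, it only looks at how the field's type is written. The following stand-alone sketch is not the crate's code; it is a toy model of what a purely textual check amounts to, and the names FieldKind and classify are hypothetical, introduced only for illustration.

```rust
/// Hypothetical classification of a struct field, based only on the
/// literal spelling of its type (not on what the type resolves to).
#[derive(Debug, PartialEq)]
enum FieldKind {
    Sequence, // spelled `Vec<...>`
    Optional, // spelled `Option<...>`
    Single,   // anything else
}

/// Toy textual check: inspects the type as a string, the way a macro
/// that never resolves aliases would have to.
fn classify(type_text: &str) -> FieldKind {
    let t = type_text.trim();
    if t.starts_with("Vec<") && t.ends_with('>') {
        FieldKind::Sequence
    } else if t.starts_with("Option<") && t.ends_with('>') {
        FieldKind::Optional
    } else {
        FieldKind::Single
    }
}

fn main() {
    assert_eq!(classify("Vec<StringOnly>"), FieldKind::Sequence);
    assert_eq!(classify("Option<WeirdName>"), FieldKind::Optional);
    // With `type Items = Vec<StringOnly>;`, the field is spelled `Items`,
    // so a textual check can no longer see the `Vec`.
    assert_eq!(classify("Items"), FieldKind::Single);
    // A custom container is not recognized either.
    assert_eq!(classify("MyList<StringOnly>"), FieldKind::Single);
}
```

In this model, an aliased or custom container silently falls into the "single value" case, which is why the type should be spelled Vec<T> or Option<T> literally in the struct definition.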
Caveats
- The support for Vec<T>/Option<T> is very limited at the moment. Namely, the macro performs a textual check to see if the member type is, e.g., Vec<T>; if so, it creates an empty vec and pushes the results of DeserializeXml::from_reader for the inner type (T) when it encounters the matching tag. Note the emphasis on textual check: the macro will fail if you "spell" Vec<T> differently (e.g., by aliasing it), or use your own container type. (The same limitations apply for Option<T>.)
- The macro only supports structs.
- An implementation of DeserializeXml is provided for Strings and numeric types (i.e. u8, i8, ...). To add support for your own type, see the section "Implementing DeserializeXml for your own struct" below.
- Struct fields of type Option<T>, where T is also a struct to which #[derive(DeserializeXml)] has been applied, are seemingly skipped during parsing unless the tag attribute is set correctly. (This might also arise in other edge cases, but this one is instructive.) This is easiest to illustrate with an example:
use deserialize_xml::DeserializeXml;
#[derive(Default, Debug, DeserializeXml)]
struct Post {
title: String,
// The inner type has a weird name, but the generated parser uses the field name
// by default, so it will look for <attachment> tags--all good, or so you think...
attachment: Option<WeirdName>,
}
#[derive(Default, Debug, DeserializeXml)]
#[deserialize_xml(tag = "attachment")] // (*) - necessary!
struct WeirdName {
path: String,
mime_type: String,
}
let input = r#"<post>
<title>A Modest Proposal</title>
<attachment>
<path>./proposal_banner.jpg</path>
<mime_type>image/jpeg</mime_type>
</attachment>
</post>"#;
// So far, this looks like a very standard example...
let result = Post::from_str(input).unwrap();
assert_eq!(result.title, "A Modest Proposal");
// ...but without the line marked (*) above, result.attachment is None!
let attachment = result.attachment.unwrap();
assert_eq!(attachment.path, "./proposal_banner.jpg");
assert_eq!(attachment.mime_type, "image/jpeg");
Without line (*), what goes wrong? Post::from_reader (which is called by Post::from_str) will look for <attachment> tags and dutifully call WeirdName::from_reader when it sees one. However, WeirdName::from_reader has no knowledge that someone else is referring to it as attachment, so the body of that implementation assumes it should only parse <weirdname> tags. Since it won't find any, we won't parse our <attachment>. By adding the #[deserialize_xml(tag = "attachment")] attribute to WeirdName, we ensure that the implementation of WeirdName::from_reader instead looks for <attachment> tags, not <weirdname> tags. Unfortunately, at the moment there is no convenient way to associate WeirdName with multiple tags.
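The mismatch described above can be reproduced with a toy model that involves no XML parsing at all. This is only a sketch of the dispatch logic as the paragraph describes it, not the code the derive macro generates; ToyParser and parse_attachment_field are hypothetical names made up for this illustration.

```rust
/// Hypothetical stand-in for a derived parser: it only accepts an element
/// whose tag equals the one tag it was generated for.
struct ToyParser {
    own_tag: &'static str,
}

impl ToyParser {
    /// Returns the element body if `tag` is the expected one, else None.
    fn parse<'a>(&self, tag: &str, body: &'a str) -> Option<&'a str> {
        if tag == self.own_tag {
            Some(body)
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for the parent (Post): it dispatches on the field
/// name ("attachment") and hands the element over to the child parser.
fn parse_attachment_field<'a>(child: &ToyParser, tag: &str, body: &'a str) -> Option<&'a str> {
    if tag == "attachment" {
        child.parse(tag, body)
    } else {
        None
    }
}

fn main() {
    // Without (*): the child only knows the lowercased struct name.
    let default_child = ToyParser { own_tag: "weirdname" };
    // With (*): the child was told to expect <attachment>.
    let tagged_child = ToyParser { own_tag: "attachment" };

    // Parent and child disagree on the tag, so nothing is parsed.
    assert_eq!(parse_attachment_field(&default_child, "attachment", "..."), None);
    // Parent and child agree, so the element body is parsed.
    assert_eq!(parse_attachment_field(&tagged_child, "attachment", "..."), Some("..."));
}
```

The model also makes the last limitation visible: because each parser holds exactly one expected tag, it cannot serve two parents that refer to it under different names.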
Implementing DeserializeXml for your own struct
Of course, you can implement DeserializeXml yourself from scratch, but doing so tends to involve frequently repeating some boilerplate XML parser manipulation code. Instead, see the documentation and implementation of impl_deserialize_xml_helper for a more ergonomic way of handling the common case.