
se_dump

Some structs to facilitate parsing of Stack Exchange dumps into easy-to-use values

1 unstable release

0.1.0 Jan 6, 2023


MIT/Apache

9KB
146 lines

Parse Stack Exchange Dumps

This crate provides structs for deserializing Stack Exchange data dumps into easy-to-use values. The example reads the sample Posts.xml with quick_xml and checks the first post:

use std::fs::File;
use std::io::BufReader;
use std::path::Path;
use quick_xml::de::from_reader;
use se_dump::post::{PostId, Posts, PostType};

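// Open the sample dump through a buffered reader.
let reader = BufReader::new(File::open(Path::new("sample_data/Posts.xml")).unwrap());
// Deserialize the whole file into a `Posts` value.
let posts: Posts = from_reader(reader).unwrap();
// The first post in the sample data is an answer with Id 2115.
assert_eq!(posts.posts[0].id, PostId(2115));
assert_eq!(posts.posts[0].post_type, PostType::Answer);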
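
For readers unfamiliar with the dump format: each post in Posts.xml is a single `<row ... />` element whose attributes carry the fields, and in the public dump schema a `PostTypeId` of 1 marks a question and 2 an answer. The following is a minimal, std-only sketch of that mapping. It is not how se_dump is implemented (the crate deserializes through quick_xml); the `PostId` and `PostType` types here are hypothetical stand-ins, `attr` and `parse_row` are illustrative helpers, and the sample row is made up apart from its `Id` and `PostTypeId`.

```rust
// Hypothetical stand-ins for illustration; not se_dump's actual definitions.
#[derive(Debug, PartialEq)]
struct PostId(u32);

#[derive(Debug, PartialEq)]
enum PostType {
    Question,
    Answer,
    Other(u8),
}

impl PostType {
    // In the public dump schema, PostTypeId 1 is a question and 2 is an answer.
    fn from_id(id: u8) -> PostType {
        match id {
            1 => PostType::Question,
            2 => PostType::Answer,
            n => PostType::Other(n),
        }
    }
}

// Extract the value of `name="..."` from a single `<row ... />` line.
// Deliberately naive: assumes double-quoted attributes and does no entity unescaping.
// The leading space in the key keeps `Id` from matching inside `ParentId`.
fn attr<'a>(row: &'a str, name: &str) -> Option<&'a str> {
    let key = format!(" {}=\"", name);
    let start = row.find(&key)? + key.len();
    let end = row[start..].find('"')? + start;
    Some(&row[start..end])
}

// Turn one row into the two fields checked in the example above.
fn parse_row(row: &str) -> Option<(PostId, PostType)> {
    let id: u32 = attr(row, "Id")?.parse().ok()?;
    let type_id: u8 = attr(row, "PostTypeId")?.parse().ok()?;
    Some((PostId(id), PostType::from_id(type_id)))
}

fn main() {
    // Illustrative row; values other than Id and PostTypeId are invented.
    let row = r#"<row Id="2115" PostTypeId="2" ParentId="2114" Score="3" />"#;
    assert_eq!(parse_row(row), Some((PostId(2115), PostType::Answer)));
}
```

A real dump should go through a proper XML parser, as the crate does, since post bodies contain escaped HTML that this sketch would not decode.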

Incomplete

Currently, only the Post and PostLink structs are provided.

Dependencies

~1.3–2MB
~39K SLoC