- short-paper, August 2018
The Causal Graph CRDT for Complex Document Structure
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 34, Pages 1–4, https://doi.org/10.1145/3209280.3229110
Commutative Replicated Data Types (CRDTs) are an emerging tool for real-time collaborative editing. Existing work on CRDTs mostly focuses on documents as a list of text content, but large documents (having over 7,000 pages) with complex sectional ...
- short-paper, August 2018
Document clustering as a record linkage problem
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 39, Pages 1–4, https://doi.org/10.1145/3209280.3229109
This work examines document clustering as a record linkage problem, focusing on named-entities and frequent terms, using several vector and graph-based document representation methods and k-means clustering with different similarity measures. The JedAI ...
- short-paper, August 2018
Measuring the Centrality of the References in Scientific Papers
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 44, Pages 1–4, https://doi.org/10.1145/3209280.3229104
Citation analysis is considered as major and one of the most popular branches of bibliometrics. Citation analysis is based on the assumption that all citations have similar values and weights each equally. Specific research fields like content-based ...
- short-paper, August 2018
Helmholtz Principle on word embeddings for automatic document segmentation
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 40, Pages 1–4, https://doi.org/10.1145/3209280.3229103
Automatic document segmentation gets more and more attention in the natural language processing field. The problem is defined as text division into lexically coherent fragments. In fact, most of realistic documents are not homogeneous, so extracting ...
- short-paper, August 2018
Automatic Term Extraction in Technical Domain using Part-of-Speech and Common-Word Features
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 51, Pages 1–4, https://doi.org/10.1145/3209280.3229100
Extracting key terms from technical documents allows us to write effective documentation that is specific and clear, with minimum ambiguity and confusion caused by nearly synonymous but different terms. For instance, in order to avoid confusion, the ...
- short-paper, August 2018
Semantically Weighted Similarity Analysis for XML-based Content Components
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 20, Pages 1–4, https://doi.org/10.1145/3209280.3229098
Uncontrolled variants and duplicate content are ongoing problems in component content management; they decrease the overall reuse of content components. Similarity analyses can help to clean up existing databases and identify problematic texts, however, ...
- short-paper, August 2018
diffi: diff improved; a preview
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 38, Pages 1–4, https://doi.org/10.1145/3209280.3229084
diffi (diff improved) is a comparison tool whose primary goal is to describe the differences between the content of two documents regardless of their formats. diffi examines the stacks of abstraction levels of the two documents to be compared, finds ...
- research-article, August 2018
Automatic Rights Management for Photocopiers
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 21, Pages 1–10, https://doi.org/10.1145/3209280.3209531
We introduce a system to automatically manage photocopies made from copyrighted printed materials. The system monitors photocopiers to detect the copying of pages from copyrighted publications. Such activity is tallied for billing purposes. Access ...
- research-article, August 2018, Best Paper
Choosing Math Features for BM25 Ranking with Tangent-L
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 17, Pages 1–10, https://doi.org/10.1145/3209280.3209527
Combining text and mathematics when searching in a corpus with extensive mathematical notation remains an open problem. Recent results for Tangent-3 on the math and text retrieval task at NTCIR-12, for example, have room for improvement, even though ...
- research-article, August 2018
A Market Analytics Approach to Restaurant Review Data
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 16, Pages 1–7, https://doi.org/10.1145/3209280.3209524
We present a novel marketing method for consumer trend detection from online user generated content, which is motivated by the gap identified in the market research literature. The existing approaches for trend analysis are generally based on rating of ...
- research-article, August 2018
Semantic Interoperability for Electronic Business through a Novel Cross-Context Semantic Document Exchange Approach
DocEng '18: Proceedings of the ACM Symposium on Document Engineering 2018, Article No.: 28, Pages 1–10, https://doi.org/10.1145/3209280.3209523
The E-marketplace is a common place where entities situated in different contexts conduct business electronically. Since sellers and buyers may be located in areas with different languages, customs and even business standards, business documents may be ...