Victor Porton

Please participate in the project or at least put stars at GitHub (links above).

The project in a nutshell

Written a specification (congratulate me with great work) for automatic transformation of XML documents based on namespaces, written a software (XML Boiler) to implement these transformations. Written a short tutorial for XML Boiler. 

There was no automatic way to transform between XML files of different formats previously. This new way is the technological revolution.

Most of the specification was implemented in Python programming language, resulting in this free software.

The most important current project goal (as of Apr 2019) is to rewrite the entire project in D programming language (because Python was found too slow and also not enough reliable).

Additional project purpose: Develop some general-purpose libraries for D programming language.

The benefits of the project include:

  • freely intermix tag sets of different sets of tag semantics (using XML namespaces), without disturbing each other (such as by name clash) in the global world
  • add your new tags to HTML (and other XML-based formats)
    • get rid of using HTML in future Web, switch it to proper semantic XML formats
      • make XSL-format based browsers with automatic generation of XSL from other XML formats
    • make the automatic coloring of source listings (for example)
  • add macroses and include (such as by XInclude) other files in XML
  • intermix different XML formats, with intelligent automatic processing of the mix
    • embed one XML format in another one
    • automatically choose the order of different XML converters applied to your mixed XML file
  • make browsers to show your XML in arbitrary format
  • make processing XML intelligent (with your custom scripts)
  • integrating together XML conversion and validation scripts written in multiple programming languages
  • associating semantics (such as relations with other namespaces and validation rules) to a namespace
    • semantics can be described as an RDF resource at a namespace URL (or a related URL)
  • many more opportunities
  • integrate all of the above in the single command

Intellectual Merit

There was no automatic way to transform between XML files of different formats previously. This new way is a technological revolution. The same concerns the validation of multi-namespace XML documents.

The main author’s invention is an elaborate automatic selection of the order (with repetitions if needed) of the XML transformations apply to the transformed XML file or its parts (processing of XML file by parts is also an author’s invention).

The project’s philosophical value is assigning semantics to XML namespaces, mainly as interrelations between different namespaces, what was previously considered like something non-existing.

The specification written is a breakthrough in XML processing. Many details were needed to be resolved in a clever way when writing it.

The D language is considered by the author as the currently most advanced and perspective programming language, thus making the purpose to advance D libraries important.

Broader Impacts

The project intends to replace older means such as HTML and TeX/LaTeX, so revolutionizing such things as the Web and scientific writing.

A future HTML replacement could be done in practice in the following steps:

  1. Develop some tag sets by independent users or standardization committees. Use my software to convert XML files containing them to the usual HTML for legacy browser compatibility.
  2. Standardize news tag sets as a part of a new post-HTML standard(s).
  3. Support the new tags sets in browsers (and other HTML software such as search engines).

This would provide a smooth, multi-stage transition from legacy HTML to a new standard, what is before was impossible due to complexity of this global social process not being split into appropriate multiple stages, not allowing public participation of persons outside central standardization committees and smooth transition to new standards while the browsers are not yet updated (the chicken-egg problem: we were unable to update browsers while the standards are not yet updated and unable to update standards while the browsers are not yet updated, and also the need to do all the work in a central standardization committee not allowing enough public participation in developing a future HTML replacement).

Future TeX/LaTeX replacement could be done by developing a tag set (possibly including MathML for math formulas) for formatting complex “scientific” documents together with tag set(s) for complex Turing-complete logic (for such things as macroses and conditional processing, etc.) like one of TeX but much more convenient for the users and for transformation and data extraction software. Then this software could be optionally (partially) standardized. Unlike the above proposal of replacing HTML, for the TeX replacement, it is not strictly necessary to create a browser software supporting it, but my software (together with scripts for it by other users) could itself be used as a TeX/LaTeX replacement. Then software for data transformation/extraction for databases like Scopus could be developed.

In the future RDF documents describing the semantics of namespaces could be placed at namespace URLs (or related URLs) to support for anybody to be able to create and distribute his own XML tag sets standards and their semantics, without the need to resort to central standardization committees. (Scripts referred from such RDF documents could be executed in secure software sandboxes.)

The project files (including the documentation and study materials) are openly disseminated to the public, including by the means of Wikiversity study and research site.

The project improves Internet democracy and broader participation of underrepresented groups by allowing anyone to create his own XML standards.

The project develops free software: both specialized software for XML processing and general-purpose software libraries.

Leave a Reply

Your email address will not be published. Required fields are marked *