Homework 2: Code2HTML

Create a Code2HTML utility.

This homework is worth 30 points.

Goals

When you finish this homework you should:

Assignment

When typing notes, I frequently create code to support those notes. I like to include this code within the notes but find it tedious to reformat the code so that it appears without errors in html or my editor when in html mode.

The two major problems are & and <. Each of these has a special meaning in both C/C++ and HTML. When importing code, & must be replaced with &amp;amp; and < with &amp;lt;.

In addition, my editor would be much happier if > were changed to &gt;, otherwise the color highlighting becomes confused.
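The substitutions described above can be sketched as a single pass over each line of code. This is only one possible approach, not a required design; the function name EscapeHtml is hypothetical and does not come from the assignment. Working one character at a time means an & that is introduced by a replacement (as in &lt;) is never escaped a second time.

```cpp
#include <string>

// Hypothetical helper (the name is an assumption, not part of the assignment).
// Returns a copy of `line` in which the three characters that confuse HTML
// (and the editor's highlighting) are replaced by their entities.
std::string EscapeHtml(const std::string& line) {
    std::string out;
    out.reserve(line.size());
    for (char c : line) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}
```

Under this sketch, the line `#include <iostream>` becomes `#include &lt;iostream&gt;`, which matches the sample output shown later in this assignment.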

A second matter is the heading. I use Google's code-prettify package. To use this package, the html file must include the following line:

   <script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>

In addition, code must be wrapped in a <pre class="prettyprint"> ... </pre> block.

Your assignment is to write a program which automatically performs these tasks. For example, the file foo.C without formatting would appear as:

#include 

using namespace std;

void PrintValue ( int * value );

int main ( int argv, char * argc[]) {
    int i = 3;

    PrintValue ( &i );

    cout << "All Done " << endl;

    return 0;
}

void PrintValue ( int * value ) {
    cout << "The value of I is " << *value << endl;
    return;
}
Note: some items, such as the stream insertion operator and the & in &i, work just fine, but the #include has been mangled because the browser treats <iostream> as a tag and does not display it. In addition, my editor is very unhappy; take a look at this screen shot:

You are to create a program Code2HTML which will take a file such as foo.C and produce a file such as foo.html:


<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
<pre class="prettyprint">

#include &lt;iostream&gt;

using namespace std;

void PrintValue ( int * value );

int main ( int argv, char * argc[]) {
    int i = 3;

    PrintValue ( &amp;i );

    cout &lt;&lt; "All Done " &lt;&lt; endl;

    return 0;
}

void PrintValue ( int * value ) {
    cout &lt;&lt; "The value of I is " &lt;&lt; *value &lt;&lt; endl;
    return;
}
</pre>

By default Code2HTML should read from the standard input, and write to the standard output. It should

The command Code2HTML < foo.C > foo.html should create the file foo.html as given.

Thinking about the default behavior, I have decided there are a few additional functions that I might take advantage of from time to time. For example, I might want to mark the variable argc in bold, or int in italics. It would be convenient to be able to mark cout as a link to http://www.cplusplus.com/reference/iostream/cout or even replace the variable i with a different name such as theNumber. Finally, I might want to be able to specify either an input or output file name.

To accomplish this, your program should support the following command line options

You are permitted to use the STL. I would investigate using a vector to store the various lists of words you need to manipulate. A map might make sense, but I think it would be overkill. If you want help with the STL, please speak to me.

You are not permitted to use sed, awk or any other file editing tool.

I will not try to make it overly complex. For example, I will only apply one task to a word. Your program should, in theory, be able to support -b hello -i hello -r hello "hi there", but I will not test this.

Submission

The code must be submitted to your instructor via email by class time on the due date. The file should be submitted as a .tgz file, which is produced using the following process:

I understand that you may not have done this in the past, but given the above instructions, plus the ability to ask questions, you should be able to accomplish this task. If you don't understand, or wish for further assistance, please ask.

All parts of this document are part of the assignment. Failure to submit code in the requested form will result in a deduction of points. Failure to provide the requested files will as well. Part of the world of systems administration is to be able to learn to perform new and different tasks. If it can be automated, someone will do this, and no one will pay you to accomplish such tasks. Embrace learning something new.

From Wikipedia:

The tar format continues to be used extensively for open-source software
distribution. Linux versions use features prominently in various software
distributions, with most software source code made available in gzip
compressed tar archives (.tar.gz file suffix).