1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
|
# Static site generator in 90 lines of Python code
Created: Apr 16, 2025
[На русском](/ru/blog/2025/01-sitegen)
A long time ago, I made a small business card website. It consisted of three
simple HTML pages, one CSS file (which I generated from SCSS), several fonts
and images. That was more than enough to get a link to my website featured
in a resume or social media profile.

I recently decided to continue working <a href="https://github.com/blankhex/bhlib" target="_blank">on my pet-project</a>
and would like to publish all sorts of notes and articles on this topic on my
website. I didn't want to manually mess with HTML files, so I decided to look
for an alternative in the form of some kind of static website generator.
Ideally, I would like it to be:
- Small and simple
- Able to work with Markdown
- Able syntax-highlight blocks of code
Unfortunately, I couldn't find any suitable solutions for myself, so I decided
to build my own using Python, <a href="https://mistune.lepture.com/en/latest/" target="_blank">mistune</a>
Markdown parser, <a href="https://jinja.palletsprojects.com/en/stable/" target="_blank">Jinja2</a>
template engine, and <a href="https://pygments.org" target="_blank">Pygments</a>.
The whole generation process boils down to the following:
1. For every file in the input directory check whether it is Markdown
- If yes - convert it to HTML (with highlighting) and write to output directory
- If no - copy as is to output directory
2. Compress content of the output directory
This generation process has a rather major drawback - due to the fact that
there is no post-processing of HTML, any links to other Markdown pages must
end with a `.html` extension[^1].
[^1]: This can be mitigated by special web-server configuration, that replaces
`.md` extension with `.html` or by omitting `.md` extension entirely and
using something like `try_files $uri $uri.html`
Here is the code:
```python
import re, jinja2, mistune, shutil, os, pathlib, tarfile
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter
from pygments import highlight
class PygmentsHTMLRenderer(mistune.HTMLRenderer):
def block_code(self, code: str, info = None):
if not info:
return '\n<pre><code>%s</code></pre>\n' % mistune.escape(code)
lexer = get_lexer_by_name(info, stripall=True)
formatter = HtmlFormatter(lineseparator='<br>')
return highlight(code, lexer, formatter)
def convert_markdown(page: str):
plugins = ['footnotes', 'table', 'strikethrough', 'url']
renderer = PygmentsHTMLRenderer(escape=False)
return mistune.create_markdown(plugins=plugins, renderer=renderer)(page)
def extract_title(page: str):
matches = re.match('<h1>(.*?)</h1>', page)
if matches:
return matches.group(1)
return 'BlankHex'
def handle_file(path: str, input_dir: str, output_dir: str, template_name: str):
# Calculate input and output paths
relpath = os.path.relpath(path, input_dir)
input_path = path
output_path = os.path.join(output_dir, relpath)
if input_path.endswith('.md'):
output_path = output_path.replace('.md', '.html')
# Don't convert if output path exists
if os.path.exists(output_path):
return
# Run conversion
pathlib.Path(os.path.dirname(output_path)).mkdir(parents=True, exist_ok=True)
if input_path.endswith('.md'):
# Read Markdown document
with open(input_path, 'r') as handle:
markdown_page = handle.read()
# Get Pygments styles for light and dark themes
light_style = HtmlFormatter(style='default').get_style_defs()
dark_style = HtmlFormatter(style='monokai').get_style_defs()
# Convert Markdown document to HTML document
html_page = convert_markdown(markdown_page)
html_header = extract_title(html_page)
environment = jinja2.Environment(loader=jinja2.FileSystemLoader('template/'))
template = environment.get_template(template_name)
output_page = template.render(title=html_header,
body=html_page,
light_style=light_style,
dark_style=dark_style)
# Write HTML document
with open(output_path, 'w') as handle:
handle.write(output_page)
else:
# Copy file as is
shutil.copy(path, output_path)
def convert_dir(input_dir: str, output_dir: str, template_name: str):
# Convert or copy every file from the input directory to the output directory
for subdir, dirs, files in os.walk(input_dir):
for file in files:
handle_file(os.path.join(subdir, file), input_dir, output_dir, template_name)
# Remove output from previous run
if os.path.isdir('public'):
shutil.rmtree('public')
if os.path.isfile('public.tgz'):
os.remove('public.tgz')
# Run conversion
convert_dir('content', 'public', 'template.html')
with tarfile.open('public.tgz', 'w:gz') as tar:
for file in os.listdir('public'):
tar.add(os.path.join('public', file), file)
```
|