Initial commit

2025-04-17 11:56:15 +03:00
commit 92a528b03b
28 changed files with 767 additions and 0 deletions
--- a/content/common/404.md
+++ b/content/common/404.md
@@ -0,0 +1,9 @@
+Sorry, the page you're looking for doesn't exist :(
+
+Check the [main page](/en/about)
+
+---
+
+Извините, здесь нет страницы, которую вы ищете :(
+
+Загляните на [главную страницу](/ru/about)
--- a/content/common/favicon.ico
+++ b/content/common/favicon.ico
--- a/content/common/favicon.png
+++ b/content/common/favicon.png
--- a/content/common/images/01-oldsite.png
+++ b/content/common/images/01-oldsite.png
--- a/content/common/images/hex.png
+++ b/content/common/images/hex.png
--- a/content/common/index.md
+++ b/content/common/index.md
@@ -0,0 +1,3 @@
+[English](en/about)
+
+[Русский](ru/about)
--- a/content/common/robots.txt
+++ b/content/common/robots.txt
@@ -0,0 +1,14 @@
+User-agent: *
+Allow: /
+
+User-agent: GPTBot
+Disallow: /
+
+User-agent: ChatGPT-User
+Disallow: /
+
+User-agent: anthropic-ai
+Disallow: /
+
+User-agent: Google-Extended
+Disallow: /
--- a/content/en/about.md
+++ b/content/en/about.md
@@ -0,0 +1,19 @@
+# About
+
+Graduate of <a rel="noreferrer" href="https://itmo.ru" target="_blank">ITMO University</a> 
+and <a rel="noreferrer" href="https://www.spbstu.ru" target="_blank">Saint Petersburg State Polytechnic University</a>.  
+
+I like:
+- programming in C and C++
+- studying various algorithms
+- developing games (usually small prototypes)
+- foxes
+
+## Contacts
+
+Social Media:
+- <a rel="noreferrer" target="_blank" href="https://twitter.com/_blankhex_">X/Twitter</a> (`@_blankhex_`)
+- <a rel="noreferrer" target="_blank" href="https://www.reddit.com/user/_blankhex_">Reddit</a> (`u/_blankhex_`)
+- <a rel="noreferrer" target="_blank" href="https://github.com/blankhex">GitHub</a> (`blankhex`)
+
+You can also contact me via email: [me@blankhex.com](mailto:me@blankhex.com)
--- a/content/en/blog.md
+++ b/content/en/blog.md
@@ -0,0 +1,4 @@
+# Blog
+
+## 2025
+Apr 16 - [Static site generator in 90 lines of Python code](blog/2025/01-sitegen)
--- a/content/en/blog/2025/01-sitegen.md
+++ b/content/en/blog/2025/01-sitegen.md
@@ -0,0 +1,135 @@
+# Static site generator in 90 lines of Python code
+
+Created: Apr 16, 2025
+
+[На русском](/ru/blog/2025/01-sitegen)
+
+A long time ago, I made a small business card website. It consisted of three 
+simple HTML pages, one CSS file (which I generated from SCSS), several fonts 
+and images. That was more than enough to get a link to my website featured 
+in a resume or social media profile.
+
+
+![Picture of the old website](/images/01-oldsite.png "Picture of the old website")
+
+I recently decided to continue working <a href="https://github.com/blankhex/bhlib" target="_blank">on my pet project</a> 
+and would like to publish all sorts of notes and articles on this topic on my 
+website. I didn't want to manually mess with HTML files, so I decided to look 
+for an alternative in the form of some kind of static website generator. 
+Ideally, I would like it to be:
+
+- Small and simple
+- Able to work with Markdown
+- Able to syntax-highlight blocks of code
+
+Unfortunately, I couldn't find anything that suited me, so I decided
+to build my own using Python, the <a href="https://mistune.lepture.com/en/latest/" target="_blank">mistune</a> 
+Markdown parser, the <a href="https://jinja.palletsprojects.com/en/stable/" target="_blank">Jinja2</a> 
+template engine, and the <a href="https://pygments.org" target="_blank">Pygments</a> syntax highlighter.
+The whole generation process boils down to the following:
+
+1. For every file in the input directory, check whether it is a Markdown file
+   - If yes, convert it to HTML (with syntax highlighting) and write it to the output directory
+   - If no, copy it as is to the output directory
+2. Compress the contents of the output directory
+
+This process has one significant drawback: because the generated HTML is not
+post-processed, any link to another Markdown page must be written with a
+`.html` extension[^1].
+
+[^1]: This can be mitigated with a web-server configuration that rewrites the
+  `.md` extension to `.html`, or by omitting the extension entirely and
+  using something like `try_files $uri $uri.html`.
+
+Here is the code:
+
+```python
+import re, jinja2, mistune, shutil, os, pathlib, tarfile
+from pygments.lexers import get_lexer_by_name
+from pygments.formatters import HtmlFormatter
+from pygments import highlight
+
+
+class PygmentsHTMLRenderer(mistune.HTMLRenderer):
+    def block_code(self, code: str, info = None):
+        if not info:
+            return '\n<pre><code>%s</code></pre>\n' % mistune.escape(code)
+        lexer = get_lexer_by_name(info, stripall=True)
+        formatter = HtmlFormatter(lineseparator='<br>')
+        return highlight(code, lexer, formatter)
+
+
+def convert_markdown(page: str):
+    plugins = ['footnotes', 'table', 'strikethrough', 'url']
+    renderer = PygmentsHTMLRenderer(escape=False)
+    return mistune.create_markdown(plugins=plugins, renderer=renderer)(page)
+
+
+def extract_title(page: str):
+    matches = re.match('<h1>(.*?)</h1>', page)
+    if matches:
+        return matches.group(1)
+    return 'BlankHex'
+
+
+def handle_file(path: str, input_dir: str, output_dir: str, template_name: str):
+    # Calculate input and output paths
+    relpath = os.path.relpath(path, input_dir)
+    input_path = path
+    output_path = os.path.join(output_dir, relpath)
+    if input_path.endswith('.md'):
+        output_path = os.path.splitext(output_path)[0] + '.html'
+
+    # Don't convert if output path exists
+    if os.path.exists(output_path):
+        return
+
+    # Run conversion
+    pathlib.Path(os.path.dirname(output_path)).mkdir(parents=True, exist_ok=True)
+    if input_path.endswith('.md'):
+        # Read Markdown document
+        with open(input_path, 'r', encoding='utf-8') as handle:
+            markdown_page = handle.read()
+
+        # Get Pygments styles for light and dark themes
+        light_style = HtmlFormatter(style='default').get_style_defs()
+        dark_style = HtmlFormatter(style='monokai').get_style_defs()
+
+        # Convert Markdown document to HTML document
+        html_page = convert_markdown(markdown_page)
+        html_header = extract_title(html_page)
+        environment = jinja2.Environment(loader=jinja2.FileSystemLoader('template/'))
+        template = environment.get_template(template_name)
+        output_page = template.render(title=html_header,
+                                      body=html_page,
+                                      light_style=light_style,
+                                      dark_style=dark_style)
+
+        # Write HTML document
+        with open(output_path, 'w', encoding='utf-8') as handle:
+            handle.write(output_page)
+    else:
+        # Copy file as is
+        shutil.copy(path, output_path)
+
+
+def convert_dir(input_dir: str, output_dir: str, template_name: str):
+    # Convert or copy every file from the input directory to the output directory
+    for subdir, dirs, files in os.walk(input_dir):
+        for file in files:
+            handle_file(os.path.join(subdir, file), input_dir, output_dir, template_name)
+
+
+# Remove output from previous run
+if os.path.isdir('public'):
+    shutil.rmtree('public')
+if os.path.isfile('public.tgz'):
+    os.remove('public.tgz')
+
+# Run conversion
+convert_dir('content', 'public', 'template.html')
+with tarfile.open('public.tgz', 'w:gz') as tar:
+    for file in os.listdir('public'):
+        tar.add(os.path.join('public', file), file)
+```
+
--- a/content/en/projects.md
+++ b/content/en/projects.md
@@ -0,0 +1,3 @@
+# Projects
+
+As of now this page is empty :(
--- a/content/ru/about.md
+++ b/content/ru/about.md
@@ -0,0 +1,19 @@
+# Обо мне
+
+Выпускник университетов <a rel="noreferrer" href="https://itmo.ru" target="_blank">ИТМО</a> 
+и <a rel="noreferrer" href="https://www.spbstu.ru" target="_blank">СПбПУ</a>. 
+
+Я люблю:
+- программирование на C и C++
+- изучение различных алгоритмов
+- разработку игр (обычно небольшие прототипы)
+- лис
+
+## Контакты
+
+Я в соцсетях:
+- <a rel="noreferrer" target="_blank" href="https://twitter.com/_blankhex_">X/Twitter</a> (`@_blankhex_`)
+- <a rel="noreferrer" target="_blank" href="https://www.reddit.com/user/_blankhex_">Reddit</a> (`u/_blankhex_`)
+- <a rel="noreferrer" target="_blank" href="https://github.com/blankhex">GitHub</a> (`blankhex`)
+
+Также можно воспользоваться электронной почтой [me@blankhex.com](mailto:me@blankhex.com)
--- a/content/ru/blog.md
+++ b/content/ru/blog.md
@@ -0,0 +1,4 @@
+# Блог
+
+## 2025
+16 апреля - [Генератор статических сайтов в 90 строк Python кода](blog/2025/01-sitegen) 
--- a/content/ru/blog/2025/01-sitegen.md
+++ b/content/ru/blog/2025/01-sitegen.md
@@ -0,0 +1,134 @@
+# Генератор статических сайтов в 90 строк Python кода
+
+Создано: 16 апреля 2025
+
+[In English](/en/blog/2025/01-sitegen)
+
+Давным-давно я сделал небольшой сайт-визитку - он состоял из трех простеньких 
+HTML-страниц, одного CSS-файла (который я генерировал из SCSS), нескольких 
+шрифтов и картинок. Этого было более чем достаточно, чтобы ссылка на мой 
+сайт красовалась в каком-либо резюме или профиле соцсети.
+
+![Изображение старого сайта](/images/01-oldsite.png "Изображение старого сайта")
+
+Недавно я решил продолжить работу <a href="https://github.com/blankhex/bhlib" target="_blank">над своим pet проектом</a> 
+и хотел бы публиковать на своем сайте всякие заметки и статьи на эту тему.
+Мне не хотелось вручную возиться с HTML-файлами, поэтому я решил подыскать
+альтернативу в виде какого-нибудь генератора статических сайтов. В идеале, я 
+хотел бы, чтобы он был:
+
+- Небольшим и достаточно простым
+- Мог работать с Markdown
+- Мог выполнять подсветку синтаксиса в блоках кода
+
+К сожалению, я не смог найти ни одного подходящего для себя решения, поэтому
+я решил собрать свое на коленке, используя Python, парсер Markdown'а <a href="https://mistune.lepture.com/en/latest/" target="_blank">mistune</a>,
+шаблонизатор <a href="https://jinja.palletsprojects.com/en/stable/" target="_blank">Jinja2</a> 
+и <a href="https://pygments.org" target="_blank">Pygments</a>. Весь процесс 
+генерации сводится к следующему:
+
+1. Для каждого файла во входном каталоге проверяем, является ли он Markdown
+   - Если да - преобразуем его в HTML (с подсветкой синтаксиса) и записываем 
+     в выходной каталог
+   - Если нет - копируем файл как есть в выходной каталог
+2. Сжимаем содержимое выходного каталога
+
+У данного процесса генерации есть достаточно крупный недостаток - из-за того, 
+что нет постобработки HTML, любые ссылки на другие страницы Markdown должны 
+заканчиваться расширением `.html`[^1].
+
+[^1]: Эту проблему можно устранить с помощью специальной настройки веб-сервера,
+  которая заменяет расширение `.md` на `.html`, или путем автоматического 
+  дописывания расширения `.html` (к примеру: `try_files $uri $uri.html`).
+
+Сам код генератора:
+
+```python
+import re, jinja2, mistune, shutil, os, pathlib, tarfile
+from pygments.lexers import get_lexer_by_name
+from pygments.formatters import HtmlFormatter
+from pygments import highlight
+
+
+class PygmentsHTMLRenderer(mistune.HTMLRenderer):
+    def block_code(self, code: str, info = None):
+        if not info:
+            return '\n<pre><code>%s</code></pre>\n' % mistune.escape(code)
+        lexer = get_lexer_by_name(info, stripall=True)
+        formatter = HtmlFormatter(lineseparator='<br>')
+        return highlight(code, lexer, formatter)
+
+
+def convert_markdown(page: str):
+    plugins = ['footnotes', 'table', 'strikethrough', 'url']
+    renderer = PygmentsHTMLRenderer(escape=False)
+    return mistune.create_markdown(plugins=plugins, renderer=renderer)(page)
+
+
+def extract_title(page: str):
+    matches = re.match('<h1>(.*?)</h1>', page)
+    if matches:
+        return matches.group(1)
+    return 'BlankHex'
+
+
+def handle_file(path: str, input_dir: str, output_dir: str, template_name: str):
+    # Calculate input and output paths
+    relpath = os.path.relpath(path, input_dir)
+    input_path = path
+    output_path = os.path.join(output_dir, relpath)
+    if input_path.endswith('.md'):
+        output_path = os.path.splitext(output_path)[0] + '.html'
+
+    # Don't convert if output path exists
+    if os.path.exists(output_path):
+        return
+
+    # Run conversion
+    pathlib.Path(os.path.dirname(output_path)).mkdir(parents=True, exist_ok=True)
+    if input_path.endswith('.md'):
+        # Read Markdown document
+        with open(input_path, 'r', encoding='utf-8') as handle:
+            markdown_page = handle.read()
+
+        # Get Pygments styles for light and dark themes
+        light_style = HtmlFormatter(style='default').get_style_defs()
+        dark_style = HtmlFormatter(style='monokai').get_style_defs()
+
+        # Convert Markdown document to HTML document
+        html_page = convert_markdown(markdown_page)
+        html_header = extract_title(html_page)
+        environment = jinja2.Environment(loader=jinja2.FileSystemLoader('template/'))
+        template = environment.get_template(template_name)
+        output_page = template.render(title=html_header,
+                                      body=html_page,
+                                      light_style=light_style,
+                                      dark_style=dark_style)
+
+        # Write HTML document
+        with open(output_path, 'w', encoding='utf-8') as handle:
+            handle.write(output_page)
+    else:
+        # Copy file as is
+        shutil.copy(path, output_path)
+
+
+def convert_dir(input_dir: str, output_dir: str, template_name: str):
+    # Convert or copy every file from the input directory to the output directory
+    for subdir, dirs, files in os.walk(input_dir):
+        for file in files:
+            handle_file(os.path.join(subdir, file), input_dir, output_dir, template_name)
+
+
+# Remove output from previous run
+if os.path.isdir('public'):
+    shutil.rmtree('public')
+if os.path.isfile('public.tgz'):
+    os.remove('public.tgz')
+
+# Run conversion
+convert_dir('content', 'public', 'template.html')
+with tarfile.open('public.tgz', 'w:gz') as tar:
+    for file in os.listdir('public'):
+        tar.add(os.path.join('public', file), file)
+```
--- a/content/ru/projects.md
+++ b/content/ru/projects.md
@@ -0,0 +1,3 @@
+# Проекты
+
+Пока здесь ничего нет :(