Building Beautiful Data Science Blogs with Astro

You asked for it (you didn’t) and I wrote it.
Before diving into the stack and how this site runs, I want to cover the requirements I had in mind when I put this together.

  1. Ownership. Principally, I didn’t want my content living inside a domain owned by someone else. I had to be able to pick it up and redeploy it wherever I want, or self-host if needed.
  2. Static. There’s no need for a blog with a bit of styling to ship the 3 people who read it 30MB of JS just to read a little text and look at some images. Besides, who doesn’t love a perfect Lighthouse score?
  3. JS or Python based generation. The goal of the site was to let me share my thoughts with as little friction as possible, so introducing another language was out of the question.
  4. Customisation. I want to control my own CSS and styling as a first-class citizen. I struggled with a few Python SSGs that treat the theme and content as two entirely different entities; it just doesn’t work for me.

What we’ll cover

This post will show you how to build and deploy a minimal site that converts your notebooks into blog posts and deploys them automatically whenever you merge to master on GitHub.
I’ll mostly skip styling and any deep customisation. The goal here is to get something working that you can make your own later, or you can explore the many themes that exist 😌.

The Stack

Taking the above into consideration, I landed on the following:

Content

Markdown. All of the content exists as markdown files, which lets me write posts directly in markdown (as this one is).
Most Data Science notebook formats (Jupyter and Quarto are the two I currently use) support markdown conversion without much hassle, allowing me to embed notebooks with ease.

Static Site Generator

Astro. While it is a lot more feature-rich than a static blog needs, it gives me the flexibility to customise the site to my liking without having to fit within the confines of a predefined theme layout.

Styling

Astro has great support for integrations, so adding Tailwind to this build was super straightforward.

Deployment

Cloudflare Pages. Most importantly for every frugal Data Science practitioner, it is absolutely free 🤑.
There are plenty of options here, but sitting behind a cache layer also helps me deliver the photos hosted on this site without big egress bills.

Let’s Build It

Fire up your favourite definitely-not-AI-generated YouTube mix and let’s copy-paste some commands.

Kick off your Astro project

The Astro CLI puts together our skeleton project with most of what we need. If this is your first dive into Astro, I highly recommend following this guide without a theme at first. Themes are super powerful (and much more aesthetic than what we’ll be building), but they bundle together a lot of advanced features, and you need some groundwork first to know whether you even need them.

npm create astro@latest -- --add tailwind
# once the project has been created, from inside the new project folder
npm run dev

What goes where

We’ll spend most of our Astro setup in the src folder, so let’s break down what goes where.

Let’s make some changes to these pages to get a better sense of this flow.
We can start by changing our Layout.astro file to the following:

// src/layouts/Layout.astro
---
const { title } = Astro.props;
---
<html lang="en">
    <head>
        <meta charset="utf-8" />
        <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
        <meta name="viewport" content="width=device-width" />
        <meta name="generator" content={Astro.generator} />
        <title>{title}</title>
    </head>

    <body class="w-screen">
        <div class="mx-auto max-w-2xl pt-5">
            <slot />
        </div>
    </body>
</html>

We should get a compile error at this point. We’re now using the Astro.props object so that our rendered pages can pass in customisations that live outside of that <slot />.

To fix this, we’ll pass a title from our index.astro, simplifying the page while we’re at it.

// src/pages/index.astro
---
import Layout from '../layouts/Layout.astro';
---

<Layout title="Homepage">
    <h1 class="text-7xl">My Blog Site</h1>
</Layout>

Adding in Some Content

Let’s add a new blog folder under the src directory and create the following file:

[comment]: src/blog/first-post.md:
---
title: "The Blog Title"
excerpt: "A short summary of our blog"
pubDate: 2024-12-10
---
Just feel so proud, finally got a blog going after reading
one tutorial.

## An Example heading
### More heading, idk feeling lazy

```python
def oooh():
    return "a code block!"
```

The frontmatter at the top of this post is totally customisable. It allows us to fill in posts with additional metadata outside of their content. This is mostly useful for populating other parts of the site that may link to the post as well as providing filters and tags that could be used for topics.
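If frontmatter is new to you, it can help to see it as nothing more than a key/value block fenced off from the body. Here is a rough sketch of that split. It is a simplified illustration only: Astro parses frontmatter with a proper YAML parser, and `splitFrontmatter` is a hypothetical helper of mine that only copes with flat `key: value` lines.

```typescript
// Simplified illustration only: Astro uses a full YAML parser for frontmatter.
// This hypothetical helper handles flat `key: value` lines and nothing else.
interface ParsedPost {
    data: { [key: string]: string };
    body: string;
}

function splitFrontmatter(source: string): ParsedPost {
    const lines = source.split("\n");
    const data: { [key: string]: string } = {};

    // No opening fence means the whole file is body text.
    if (lines[0].trim() !== "---") {
        return { data, body: source };
    }

    // Find the closing fence.
    let end = -1;
    for (let i = 1; i < lines.length; i++) {
        if (lines[i].trim() === "---") {
            end = i;
            break;
        }
    }
    if (end === -1) {
        return { data, body: source };
    }

    // Everything between the fences is metadata.
    for (let i = 1; i < end; i++) {
        const colon = lines[i].indexOf(":");
        if (colon === -1) continue;
        const key = lines[i].slice(0, colon).trim();
        // Strip surrounding double quotes, e.g. "The Blog Title".
        const value = lines[i].slice(colon + 1).trim().replace(/^"(.*)"$/, "$1");
        data[key] = value;
    }

    return { data, body: lines.slice(end + 1).join("\n") };
}

const example = splitFrontmatter(
    '---\ntitle: "The Blog Title"\npubDate: 2024-12-10\n---\nJust feel so proud.\n'
);
// example.data.title is "The Blog Title"; example.body is "Just feel so proud.\n"
```

Note that this toy version keeps every value as a string, whereas a real YAML parser gives you richer types to work with.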

To find this content, we need to define an Astro “collection” for it to live in. To do this we create a content.config.ts within our src directory.
The schema describes the frontmatter we expect on our posts, and the loader describes where and how to find them.

// src/content.config.ts
import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

const blog = defineCollection({
    loader: glob({ pattern: "*.md", base: "./src/blog" }),
    schema: z.object({
        title: z.string(),
        excerpt: z.string(),
        pubDate: z.date(),
    })
})

export const collections = { blog }

This is central to how content management works in Astro, so we should take a moment to consider what is going on here.
Working backwards, we ultimately need to export a collections object from this file, which can then be queried from page templates (we’ll make one in a moment) or from other components that want to use the metadata.
Astro uses the defineCollection call to build each collection, taking an object with two properties: a loader and a schema.

Our loader here uses one of two inbuilt “content-grabbers”. glob lets us define a location to search for content. file (unused here) lets you point at a single file and pull the content out of that. This is how the photography section of this site has been defined 🥳.
Custom loaders are also definable if you have a more complex use case.
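To make the schema’s job a bit more concrete, here is a hand-rolled stand-in for the kind of check it performs on each post’s frontmatter. To be clear, this is not Zod and not how Astro implements it; `checkFrontmatter` is a hypothetical function I’ve written purely to illustrate the idea of rejecting posts whose metadata doesn’t match the expected shape.

```typescript
// Hand-rolled stand-in for the schema check, for illustration only.
// In the real project, Zod (via `z.object`) does this work for us.

// Returns a list of problems; an empty list means the frontmatter is valid.
function checkFrontmatter(raw: { [key: string]: unknown }): string[] {
    const problems: string[] = [];
    if (typeof raw["title"] !== "string") {
        problems.push("title must be a string");
    }
    if (typeof raw["excerpt"] !== "string") {
        problems.push("excerpt must be a string");
    }
    const pubDate = raw["pubDate"];
    if (!(pubDate instanceof Date) || isNaN(pubDate.getTime())) {
        problems.push("pubDate must be a valid date");
    }
    return problems;
}

const good = checkFrontmatter({
    title: "The Blog Title",
    excerpt: "A short summary of our blog",
    pubDate: new Date("2024-12-10"),
});

// Missing excerpt, and a date that arrived as a plain string.
const bad = checkFrontmatter({ title: "Oops", pubDate: "2024-12-10" });
// good is empty; bad holds two problems (excerpt and pubDate)
```

The useful part in practice is that a typo in your frontmatter fails loudly at build time rather than quietly rendering a broken page.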

AutoGenerating your first page

We now need to define a template page which Astro can plug our content into. To do so, create a new Astro file at src/pages/blog/[...slug].astro.

// src/pages/blog/[...slug].astro
---
import Layout from "../../layouts/Layout.astro";
import { getCollection, render } from "astro:content";

export async function getStaticPaths() {
    const posts = await getCollection("blog");
    return posts.map((post) => ({
        params: { slug: post.id },
        props: { post },
    }));
}

const { post } = Astro.props;
const { Content } = await render(post);
---

<Layout title={post.data.title}>
    <article>
        <h1>{post.data.title}</h1>
        <Content />
    </article>
</Layout>

The slug here simply returns the name of the file (without its extension), but it can again be tailored to your liking. We also include a props param containing the post itself. The render function then lets us convert the markdown into HTML.
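As a sketch of what “tailored to your liking” could look like, here is the same mapping pulled out into a plain function that prefixes each slug with the post’s year. The `Entry` type is a cut-down stand-in for a real collection entry, and `toStaticPaths` and the year prefix are my own assumptions for illustration rather than anything Astro requires.

```typescript
// Minimal stand-in for a collection entry; real entries carry more fields.
interface Entry {
    id: string;
    data: { title: string; pubDate: Date };
}

interface StaticPath {
    params: { slug: string };
    props: { post: Entry };
}

// Hypothetical alternative to `slug: post.id`: prefix the slug with the year.
function toStaticPaths(posts: Entry[]): StaticPath[] {
    return posts.map((post) => ({
        params: { slug: `${post.data.pubDate.getUTCFullYear()}/${post.id}` },
        props: { post },
    }));
}

const paths = toStaticPaths([
    { id: "first-post", data: { title: "The Blog Title", pubDate: new Date("2024-12-10") } },
]);
// paths[0].params.slug is "2024/first-post"
```

Because the route file uses a rest parameter ([...slug]), a slug containing a slash like this is fine and would be served at /blog/2024/first-post.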

You may need to run npm run astro sync and restart your dev server so that the new type definitions for the blog collection are picked up.

Tailwind Typography

You may notice that while your blog post has rendered correctly, it doesn’t look great out of the box, as we’ve not styled any content.
For a quick win, the Tailwind typography plugin gives us a nice out-of-the-box experience.
Simply run npm install -D @tailwindcss/typography and add the following to your tailwind.config.mjs.

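// tailwind.config.mjs
// An .mjs file is an ES module, so we use import/export rather than require/module.exports
import typography from '@tailwindcss/typography';

/** @type {import('tailwindcss').Config} */
export default {
  theme: {
    // ...
  },
  plugins: [
    typography,
    // ...
  ],
}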

We can then just throw the prose class on our article: <article class="prose">.

Letting Users Find this post

This page should now be generated at localhost:$PORT/blog/first-post, but we still need to let people navigate to it.
We can start by creating a quick Card component that’ll act as a link to click through from our index.

// src/components/Card.astro
---
const { title, excerpt, pubDate } = Astro.props;
---
<div class="shadow rounded-lg p-8 relative">
    <div class="md:flex items-center justify-between">
        <h2 class="text-2xl font-semibold leading-6 text-gray-800">{title}</h2>
        <p class="text-sm font-semibold md:mt-0 mt-4 leading-6 text-gray-800">{pubDate.toLocaleDateString("en-GB")}</p>
    </div>
    <p class="md:w-80 text-base leading-6 mt-4 text-gray-600">{excerpt}</p>
</div>

You’ll notice this just takes the frontmatter from our post.

We can then simply iterate over our posts in the index.astro file.
We can use each post’s id to build the URL to its page.

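// src/pages/index.astro
---
import Layout from '../layouts/Layout.astro';
import Card from '../components/Card.astro';
import { getCollection } from 'astro:content';
const blogPosts = await getCollection('blog');
---
<Layout title="Homepage">
    <h1 class="text-7xl">My Blog Site</h1>
    <div class="mt-4 mx-auto max-w-4xl space-y-4">
        {
            blogPosts.map(({ data, id }) => (
                <a href={`/blog/${id}/`}>
                    <Card {...data} />
                </a>
            ))
        }
    </div>
</Layout>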
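Once you have more than one post you’ll probably want the newest at the top, and I wouldn’t rely on the order the collection comes back in. A minimal sketch of sorting before mapping over blogPosts, assuming each entry has the pubDate from our schema; `PostSummary`, `newestFirst` and `postHref` are hypothetical names of mine, and the second post is made up.

```typescript
// Minimal stand-in for a collection entry; real entries carry more fields.
interface PostSummary {
    id: string;
    data: { title: string; pubDate: Date };
}

// Newest first. slice() copies the array so the input is left untouched.
function newestFirst(posts: PostSummary[]): PostSummary[] {
    return posts
        .slice()
        .sort((a, b) => b.data.pubDate.getTime() - a.data.pubDate.getTime());
}

// Same URL shape as the link in index.astro.
function postHref(post: PostSummary): string {
    return `/blog/${post.id}/`;
}

const sorted = newestFirst([
    { id: "first-post", data: { title: "The Blog Title", pubDate: new Date("2024-12-10") } },
    { id: "second-post", data: { title: "A Hypothetical Follow-up", pubDate: new Date("2025-01-15") } },
]);
const hrefs = sorted.map(postHref);
// hrefs is ["/blog/second-post/", "/blog/first-post/"]
```

In index.astro the same comparator can be applied to the result of getCollection before the map.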

Rest Here a While 🏕️

So, this gives us a minimal wrapper that turns markdown files into blog post pages. However, there are a few Data Science specific bits of tooling that will help our posts along.

Math Support

Currently, any equations placed in the markdown files will render as plain text.
Astro uses remark to parse markdown, so we’ll add the relevant math extension, along with the MathJax rehype plugin which renders it.

npm install rehype-mathjax remark-math

A KaTeX-based plugin (rehype-katex) is also available, but that requires you to add a CDN link to style your content correctly.

We then just need to tell astro about these plugins.

// astro.config.mjs
// ...
import remarkMath from 'remark-math';
import rehypeMathJaxSvg from 'rehype-mathjax';

export default defineConfig({
    // ...
    markdown: {
        remarkPlugins: [remarkMath],
        rehypePlugins: [rehypeMathJaxSvg],
    }
});

If we go ahead and add some mathjax to our blog post, we should see it render.

$$
x=\frac{a}{b}
$$
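By default remark-math also picks up inline math wrapped in single dollar signs, so you can mix both forms in a post. A small made-up example (the formulas are just placeholders, not from a real post):

```latex
The sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the sample variance is

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
$$
```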

Writing Posts

Finally, while writing in a straight markdown file is a reasonable experience, we really want to be able to just pipe our Jupyter notebooks straight into place with all the cell outputs we care about.
Simply create a new ipynb and then run the following:

jupyter nbconvert <my-notebook>.ipynb --to=markdown

Make sure to copy the generated markdown file and any corresponding output images into your blog folder, and it should all render as a new post 🙌.
You’ll also need to include the frontmatter discussed above as a single markdown cell at the top of your notebook.
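If you’re ever unsure which output images need to come along, one option is to scan the converted markdown for image links. A rough sketch of that follows. `findImageRefs` is a hypothetical helper of mine, and the file names in the example are made up to resemble nbconvert’s output, so check them against what your own notebook produces.

```typescript
// Collects the targets of markdown image links: ![alt](target)
function findImageRefs(markdown: string): string[] {
    const pattern = /!\[[^\]]*\]\(([^)\s]+)\)/g;
    const found: string[] = [];
    let match: RegExpExecArray | null = pattern.exec(markdown);
    while (match !== null) {
        found.push(match[1]);
        match = pattern.exec(markdown);
    }
    return found;
}

// Hypothetical nbconvert output; the file names here are made up.
const converted = [
    "Some analysis text.",
    "![png](my-notebook_files/my-notebook_3_0.png)",
    "More text, then another plot.",
    "![png](my-notebook_files/my-notebook_5_1.png)",
].join("\n");

const imageRefs = findImageRefs(converted);
// imageRefs holds the two paths under my-notebook_files/
```

Since the links are relative, keeping the images in the same position relative to the markdown file means they should not need rewriting.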

Code Styling

Broadly, code styling this way works very well. Under the hood Astro uses Shiki for syntax highlighting, and you can switch to one of its many themes by changing the configured theme.
The only obvious issue I have found is in the presentation of cell outputs, which render as though they were just another code block.

To address this we can add a custom transformer which conditionally modifies our plaintext blocks to add some custom styling.

// astro.config.mjs
export default defineConfig({
    // ...
    markdown: {
        shikiConfig: {
            transformers: [
                {
                    pre(hast) {
                        // only touch blocks highlighted as plaintext (our cell outputs)
                        if (hast.properties["dataLanguage"] === "plaintext") {
                            // inject a > at the start of each line
                            this.lines.forEach((line) => {
                                line.children.unshift({
                                    type: 'element',
                                    tagName: 'span',
                                    properties: {},
                                    children: [{ type: 'text', value: '>' }],
                                });
                            });

                            // remove the top margin to pull the output up to the previous element
                            hast.properties["style"] = (hast.properties["style"] ?? "").concat("margin-top:-32px");
                        }
                        return hast;
                    },
                },
            ],
        },
    },
});
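If the hast manipulation feels a bit opaque, here is the line-marker part of it on its own, outside of Shiki. The node types below are cut-down stand-ins I’ve defined for this sketch rather than the real hast typings, and `prependMarker` and `textOf` are hypothetical helpers.

```typescript
// Minimal hast-like shapes, just enough for this sketch.
interface TextNode {
    type: "text";
    value: string;
}
interface ElementNode {
    type: "element";
    tagName: string;
    properties: { [key: string]: string };
    children: Array<ElementNode | TextNode>;
}

// Mirrors the loop inside the transformer: put a ">" span at the start of each line.
function prependMarker(lines: ElementNode[]): void {
    lines.forEach((line) => {
        const marker: ElementNode = {
            type: "element",
            tagName: "span",
            properties: {},
            children: [{ type: "text", value: ">" }],
        };
        line.children.unshift(marker);
    });
}

// Flattens a node back to plain text so we can see the result.
function textOf(node: ElementNode | TextNode): string {
    if (node.type === "text") return node.value;
    return node.children.map((child) => textOf(child)).join("");
}

const outputLine: ElementNode = {
    type: "element",
    tagName: "span",
    properties: { class: "line" },
    children: [{ type: "text", value: "a code block!" }],
};
prependMarker([outputLine]);
const rendered = textOf(outputLine);
// rendered is ">a code block!"
```

The transformer does exactly this to every line of a plaintext block, mutating the tree in place before it is turned into HTML.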

NOTE: At the time of writing, Astro has a bug where updates to the config file do not cause the dev server to refresh. Simply delete your .astro/data-store.json file and rerun the dev server to pick up any changes!

Deployment

There is a huge range of deployment options for Astro. Ultimately, our final built site can be put into anything that will serve static files.
My preference has been for Cloudflare Pages, as it handles automatic builds when pushing to master, in addition to caching the images in the site’s photography section.

  1. Push your code to your remote git repository.
  2. Create an account, log in to the Cloudflare dashboard and select your account in Account Home > Workers & Pages.
  3. Hit Create, then the Pages tab, and then select the Connect to Git option.
  4. You’ll need to authorise Cloudflare to access your repos.
  5. Use the following build settings:
     Framework preset: Astro
     Build command: npm run build
     Build output directory: dist

That’s it! On every push to master you should now auto trigger a new update to your site.