Feb 10, 2024

Migrate WordPress Comments to Giscus(GitHub Discussions)

On New Year's Eve, I successfully transitioned this blog from WordPress to a Static Site. I shared my thoughts and experiences behind this migration in the following post:

Migrating From WordPress to Static Site Generator

One of the biggest challenge was to migrate the existing WordPress comments. It is odd to use dynamic comments with a static site. So I decided to use 3rd party solutions -Giscus, A comments system powered by GitHub Discussions.  In this blog post, we will explore the steps taken in the comments migration process. 

Note: All comments will be migrated from the repository username, with a header saying that it is a migrated comment, the name of original author and the original comment date.

Environment

This post uses WordPress 5.7, MySQL 8.0, .NET Core 7.0 and C# on MacBook(macos12.2) for writing code.

Setup Repository

First, you need to decide which repository will host your discussions. It can be the existing or new, but it must be public (not private). I did setup a new repository for the migration.

The next step is to enable the GitHub Discussions. Go to the repository settings > General > Features, Scroll down and enable the discussion

For the migration, we need access token. You can generate it in GitHub User settings.

Go to User Settings > Developer Settings (at the end of Left sidebar)  > Expand “Personal access Tokens” > Fine-grained tokens > Click “Generate new token” button.

Enter token name, expiration, repository, give proper permission and click Generate Token button.

Copy the token, we will use it in our C# code.

Configure Giscus

Install the giscus app in the repository. Give proper permissions and Repository access.

Jump to the giscus configuration page. It provides an interactive UI for configuring your giscus script. 

Here is the information I used to generate the script:

- Enter repository name

- In Page ↔️ Discussions Mapping, select  Discussion title contains page pathname option.

- Uncheck Use strict title matching option

- Select “General” as Discussion Category

- Check Only search for discussions in this category

- Uncheck all features. I don't use them

- Select “NoBorder Light” theme to meet with this blog design

Once you've entered all these information, you'll have a script you can copy and paste into your app:


    <script src="https://giscus.app/client.js"
    data-repo="techbrij/blog-comments"
    data-repo-id="R_xxxxxxxxxxx"
    data-category="General"
    data-category-id="DIC_kw000000000000"
    data-mapping="pathname"
    data-strict="0"
    data-reactions-enabled="0"
    data-emit-metadata="0"
    data-input-position="bottom"
    data-theme="noborder_light"
    data-lang="en"
    crossorigin="anonymous"
    async>
    </script>

Copy the generated script and paste it into your HTML where the comments should be displayed.

To restrict the origins that giscus will load the discussions from my repo, add a JSON file called giscus.json to your "discussions" repository:

{
  "origins": [
    "https://techbrij.com"
  ]
}

For more information, see the advanced configuration docs.

Setup WordPress DB

You can take backup of WordPress database and restore in your local environment.

Create one more table say migration_comments

CREATE TABLE `migration_comments` (
  `comment_ID` bigint unsigned NOT NULL,
  `post_ID` bigint unsigned NOT NULL DEFAULT '0',
  `github_comment_ID` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL,
  `github_post_ID` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL,
  
  `comment_author` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `comment_date` datetime NOT NULL,
   `comment_content` mediumtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
   `comment_parent` bigint unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`comment_ID`)
) ENGINE=MyISAM AUTO_INCREMENT=233107 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Why new table?

1. I don't want change anything in the existing comments and data structure.

2. GitHub API has rate limit. It is not possible to do migration in single run in case of many comments. The application will read the records if github_comment_ID is null.
Also, We need github_post_id in the APIs. It is safe to setup a separate table for the operations.

The interesting challenge is that GitHub discussion has comments and replies (2 levels only) while WordPress can nest infinite levels in comments (i.e. replies to replies to replies….). So it was tricky to migrate deep level comment. To workaround, the nested child comments are set under top level comment. We will see in the code. 

Copy the existing approved comments into the new table

INSERT INTO `db`.`migration_comments`
(`comment_ID`,
`post_ID`,
`github_comment_ID`,
`github_post_ID`,
`comment_author`,
`comment_date`,
`comment_content`,
`comment_parent`)
(select
comment_ID,
comment_post_ID,
null,
null,
comment_author,
comment_date,
comment_content,
comment_parent
from 
`db`.`wp_comments`
where
comment_approved = 1
)

C# Code

Create a new .NET core console application

dotnet new console --framework net7.0

Add the following packages:


dotnet add package Newtonsoft.Json
dotnet add package MySql.Data
dotnet add package GraphQL.Client
dotnet add package GraphQL.Client.Serializer.Newtonsoft

Open the project in VSCode and add a new file ExportComment.cs As this is one time job, let's do all in single file for simplicity. 

public class ExportComment
{
    public static string github_token = ""; 
    public static string github_repo_id = "R_00000000"; // Repo ID from Giscus generated code
    public static string github_category_id = "DIC_kw0000000";  // General Discussion
 
}

We have already a token generated in earlier step. Also, github_repo_id and github_category_id are available in Giscus generated script. You can update all these information.

There are three things: GitHub Discussion, Top level Comments and Reply.

In the migration process, first we will create a GitHub discussion related to the WordPress post with the URL of the post as discussion name, the excerpt of the post and the link to the post as body. To get all posts having the comments from WordPress DB:

    public static DataTable GetPostsToExport(string connectionString)
    {
        string sql = @"SELECT ID, post_name as slug, post_title, post_excerpt 
FROM db.wp_posts 
where ID in (SELECT distinct post_ID FROM db.migration_comments where github_post_ID is null)";
        return RunGetQuery(connectionString, sql);
    }

The above method will return all posts having comments.

Some common methods to run SQL queries:

 public static DataTable RunGetQuery(string connectionString, string query)
    {
        using (var con = new MySqlConnection(connectionString))
        {
            string sql = query;
            using (var cmd = new MySqlCommand(sql, con))
            {
                using (var adapter = new MySqlDataAdapter(cmd))
                {
                    var resultTable = new DataTable();
                    adapter.Fill(resultTable);
                    return resultTable;
                }
            }
        }
    }
    public static int RunSaveQuery(string connectionString, string query)
    {
        using (var con = new MySqlConnection(connectionString))
        {
            string sql = query;
            con.Open();
            using (var cmd = new MySqlCommand(sql, con))
            {
                int recordsAffected = cmd.ExecuteNonQuery();
                return recordsAffected;
            }
        }
    }

To create different discussions for different posts:

    public static async Task Export(string connectionString)
    {
        using (var client = new GraphQLHttpClient("https://api.github.com/graphql", new NewtonsoftJsonSerializer()))
        {
            client.HttpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {github_token}");
            // Save Posts
            var dtPosts = GetPostsToExport(connectionString);
            var i = 0;
            foreach(DataRow row in dtPosts.Rows)
            {
                var id = (UInt64)row["ID"];
                var slug = (string) row["slug"];
                var postTitle = (string) row["post_title"];
                var postExcerpt = (string) row["post_excerpt"];
                var url = "//techbrij.com/"+ slug;
                var postBody ="### "+ $"{postTitle} \n\n  {postExcerpt} \n\n [Read full article]({url}) \n ";
                var discussionID = await CreateGitHubDiscussion(client, github_repo_id, slug, postBody, github_category_id);
                SaveGithubPostID(connectionString,id,discussionID);   
                i++;           
            }

To create GitHub discussion using GraphQL API:

 public static async Task CreateGitHubDiscussion(GraphQLHttpClient client, string repoId, string title, string body, string categoryId)
    {
        try
        {
            var query = @"
                mutation CreateDiscussion($repositoryId: ID!, $title: String!, $body: String!, $categoryId: ID!) {
                    createDiscussion(input: {
                        repositoryId: $repositoryId,
                        title: $title,
                        body: $body, 
				        categoryId: $categoryId
                    }) {
                        discussion {
                            id
                        }
                    }
                }
            ";
            var variables = new
            {
                repositoryId = repoId,
                title = title,
                body = body,
                categoryId = categoryId
            };
            var request = new GraphQLRequest { Query = query, Variables = variables };
            var response = await client.SendMutationAsync(request);
            if (response.Errors != null && response.Errors.Count() > 0)
            {
                throw new Exception($"GitHub GraphQL API error: {string.Join(", ", response.Errors.Select(e => e.Message))}");
            }
            var discussionId = response.Data.createDiscussion.discussion.id;
            return discussionId;
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
            throw;
        }
    }

To save GitHub Post ID:

 public static void SaveGithubPostID(string connectionString, UInt64 postId, string githubPostId)
    {
        string sql = $"update db.migration_comments set github_post_ID = '{githubPostId}' where post_ID = {postId}";
        RunSaveQuery(connectionString, sql);
    }

Let's continue to Export method, the next step is to save comment.

 			// for mapping github id
            var dtCommentsAll = GetAllComments(connectionString);
            var commentGithubMap = new Dictionary();
            var commentParentMap = new Dictionary();
            foreach (DataRow row in dtCommentsAll.Rows)
            {
                var gcid = row["github_comment_ID"] == DBNull.Value ? "" : (string)row["github_comment_ID"];
                commentGithubMap.Add((UInt64)row["comment_ID"], gcid);
                commentParentMap.Add((UInt64)row["comment_ID"], (UInt64)row["comment_parent"]);
            }
            // Save Comments
            var dtComments = GetComments(connectionString);       
            i = 0;
            foreach (DataRow row in dtComments.Rows)
            {
                // var discussionID = "D_kwDOK8xcHM4AW3AS";
                var commentId = (UInt64)row["comment_ID"];
                var githubPostID = (string)row["github_post_ID"];
                var commentBody = (string)row["comment_content"];
                var commentAuthor = (string)row["comment_author"];
                var commentDate = (DateTime)row["comment_date"];
                var commentParent = (UInt64)row["comment_parent"];
                var comment = "_**" + commentAuthor + " "+ commentDate.ToString() + "** (Migrated from WordPress)_:  \n\n" + commentBody;
                var githubCommentID = "";
                if (commentParent == null || commentParent == 0)
                {
                    githubCommentID = await CreateGitHubDiscussionComment(client, githubPostID, comment);
                }
                else
                {
                    // As Github discussion 2 levels only so need to get nested comments as 2 level 
                    var parentID = commentParent;
                    while(commentParent > 0) {
                        parentID = commentParent;
                        commentParent = commentParentMap[commentParent];
                    }
                    var parentGithubId = commentGithubMap[parentID];
                    if (!string.IsNullOrEmpty(parentGithubId))
                    {
                       githubCommentID  = await CreateGitHubDiscussionComment(client, githubPostID, comment, parentGithubId);
                    }
                    else{
                        Console.WriteLine("No Parent Comment found for commentId "+ commentId);
                        continue;
                    }
                   
                }
                SaveGithubCommentID(connectionString, commentId, githubCommentID);
                commentGithubMap[commentId] = githubCommentID;
                i++;
                Console.WriteLine($"Added Comment {i}");
                                      
            }
        }
    }    
    

In above code, some mapping is used to get parent ID. As discussed earlier, WordPress comments supports infinite level of comment hierarchy while GitHub discussion has two levels. So we need to rebase deep nested comment under top level comment. that logic is implemented in the above code.

Here are some methods to get and save comments information in db.

 public static DataTable GetAllComments(string connectionString)
    {
        string sql = @"SELECT comment_ID,comment_parent, github_comment_ID FROM db.migration_comments";
        return RunGetQuery(connectionString, sql);
    }

    public static DataTable GetComments(string connectionString)
    {
        string sql = @"SELECT * FROM db.migration_comments where github_comment_ID is null and github_post_ID is not null;";
        return RunGetQuery(connectionString, sql);
    }
    
    public static void SaveGithubCommentID(string connectionString, UInt64 commentId, string githubCommentId)
    {
        string sql = $"update db.migration_comments set github_comment_ID = '{githubCommentId}' where comment_ID = {commentId}";
        RunSaveQuery(connectionString, sql);
    }

Here is the method to create GitHub discussion comment using GraphQL

    public static async Task CreateGitHubDiscussionComment(GraphQLHttpClient client, string discussionId, string body, string parentCommentId = null)
    {
        try
        {
            var query = @"
                mutation CreateDiscussionComment($discussionId: ID!, $body: String!, $replyToId: ID) {
                    addDiscussionComment(input: {
                        discussionId: $discussionId, 
						body: $body, 
						replyToId: $replyToId
                    }) { 
                        comment { id }                       
                    }
                }
            ";

            var variables = new
            {
                discussionId = discussionId,
                body = body,
                replyToId = parentCommentId
            };

            var request = new GraphQLRequest { Query = query, Variables = variables };

            var response = await client.SendMutationAsync(request);

            if (response.Errors != null && response.Errors.Count() > 0)
            {
                throw new Exception($"GitHub GraphQL API error: {string.Join(", ", response.Errors.Select(e => e.Message))}");
            }

            var commentId = response.Data.addDiscussionComment.comment.id;
            return commentId;

        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
            throw;
        }

    }

Here are some classes to parse and map GraphQL response


    public class Discussion
    {
        public string id;
    }
    public class CreateDiscussion
    {
        public Discussion discussion;
    }

    public class DiscussionRoot
    {
        public CreateDiscussion createDiscussion;
    }

    public class Comment
    {
        public string id;
    }

    public class addDiscussionComment
    {
        public Comment comment;
    }

    public class CommentRoot
    {
        public addDiscussionComment addDiscussionComment;
    }

Now you can call Export method in main method

  private static void Main(string[] args)
    {
    string connectionString = "....connections string....";
    ExportComment.Export(connectionString).GetAwaiter().GetResult();
    }

Define the connection string and run the application.

 

GitHub has API rate limit on each min and each hour. If you have many comments, probably you have to run it multiple times and probably on different timings. Good thing is that we have already saving information in separate table so the app will fetch non- processed comments to avoid duplicate creation on GitHub discussion.

In my case, there are a total 1774 comments in WordPress 225 posts.  The processing speed averaged around 120 comments per hour due to API rate limit. The entire migration process took a few days to complete.

Source Code

You can find the source code on the GitHub repository at this link.

Conclusion

This post covers the following things:

- Setup GitHub repository for the comments

- Install Giscus and generate configure script

- Setup WordPress database in local environment, add a new table for migration processing and add the existing comments in it.

- Create GitHub discussion for each WordPress post (with discussion name = post URL, body = post excerpt with the post link). C# app will read the data from MySQL database and create discussion in GitHub using GraphQL API.

- The comment header mentions the original author, date and then the content of the original comment.

- All nested replies will be put under top level comment in GitHub Discussion.

I hope, it helps!