
Hanakotoba




ryancahildebrandt/hanakotoba

Python


Literature in Bloom


This project contains 0% LLM-generated content

Purpose

A project to explore 花言葉 (hanakotoba, lit. flower language) in Japanese and other literary corpora.


Dataset
#

The dataset used for the current project was pulled from the following:

  • Aozora Bunko Corpus for Japanese full text works
  • Hanakotoba for flower names, translations, and associated characteristics
  • Wikipedia for conversions of Japanese decimal classification codes (分類番号)
  • Wikipedia for a list of major Japanese eras (時代)
  • This page for a list of sub-eras (元年)

Some of these sources didn’t end up being necessary for the main project, but they are included with the accompanying code for genre and date conversions.
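
As an illustration of the kind of date conversion mentioned above, here is a minimal sketch of turning a Japanese era year into a Gregorian year. The names `ERA_START` and `era_to_gregorian` are hypothetical and not taken from the project's code, and the in-memory table stands in for whatever the era dataframes actually contain; the project's own conversion may well differ.

```python
# Hypothetical lookup table (not from the project): era name -> Gregorian
# year of that era's first year (元年).
ERA_START = {
    "明治": 1868,  # Meiji
    "大正": 1912,  # Taisho
    "昭和": 1926,  # Showa
    "平成": 1989,  # Heisei
    "令和": 2019,  # Reiwa
}


def era_to_gregorian(era: str, year_in_era: int) -> int:
    """Convert an era name and a year within that era to a Gregorian year."""
    if year_in_era < 1:
        raise ValueError("era years start at 1 (元年)")
    # Year 1 of an era is the era's start year, hence the -1 offset.
    return ERA_START[era] + year_in_era - 1


print(era_to_gregorian("明治", 1))   # 1868
print(era_to_gregorian("昭和", 20))  # 1945
```

A plain dictionary keeps the sketch self-contained; with a real era table the same offset arithmetic would apply to a looked-up start year.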

Outputs

  • The main report, compiled with Datapane and also available in HTML format
  • Historical era dataframe : Jidai.csv
  • Sub-era dataframe : Gannen.csv
  • Japanese genre code dataframe : Genres.csv
  • Dataframe of all flowers/plants and associated characteristics : Hk_df.csv
  • Dataframe with all text metainfo, calculated date columns, and tagged flower occurrences with locations in the text : All_df.csv
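
To give a rough idea of what "tagged flower occurrences with locations in the text" could look like, here is a minimal sketch that finds each flower name in a string and records its character offset. The function name `tag_flowers`, the sample sentence, and the short flower list are all made up for illustration; this is one simple approach, not necessarily how All_df.csv was produced.

```python
import re


def tag_flowers(text, flower_names):
    """Return (flower, start_index) pairs for every occurrence in text."""
    hits = []
    for name in flower_names:
        # re.escape guards against names containing regex metacharacters.
        for match in re.finditer(re.escape(name), text):
            hits.append((name, match.start()))
    # Sort by position so the hits follow reading order.
    return sorted(hits, key=lambda hit: hit[1])


# Illustrative sample text and flower names (not from the dataset).
sample = "庭に桜が咲き、桜の下に菊がある。"
print(tag_flowers(sample, ["桜", "菊"]))
# [('桜', 2), ('桜', 7), ('菊', 11)]
```

Plain substring matching like this ignores word boundaries, so a flower name that appears inside a longer, unrelated word would also be tagged; a morphological tokenizer would be one way to tighten that.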

Author
Ryan Hildebrandt
Data Scientist, etc.