I'm looking for a dataset with the Mandarin pronunciations of all Chinese characters in bopomofo and/or pinyin. It also needs to be open source so that I can copy it into my own code bases.
Where can I find Chinese character bopomofo/pinyin data?
Asked by Nathan Breit
There are 2 answers
NallaN
This is a bit of a late entry, but I was searching for the same thing last year and ended up compiling my own character/bopomofo database from a number of different data sets. I have put enough work into it to call it my own, so you should check it out! It's part of a RubyGem I made to sort by bopomofo (I had a system that would not let me change the database collation settings): https://github.com/nallan/a-b-chi
It sounds like you might be looking for the Unihan Database. The Unihan Database is maintained by the Unicode Consortium.
For example, here is the data for 爱.
Here is the description of the organization and content of the Unihan Database. Be sure to read that to understand what the data is referring to.
If this is the information you want, you can download the ZIP archive that contains all this data.
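Once the archive is unpacked, pulling out the Mandarin readings can be sketched roughly as below. This assumes the readings live in a tab-separated file (named `Unihan_Readings.txt` in the releases I have seen) with lines of the form `U+7231<TAB>kMandarin<TAB>ài`, and that `#` starts a comment line. The function name `parse_mandarin_readings` is my own; check the field descriptions in the Unihan documentation before relying on this.

```python
def parse_mandarin_readings(lines):
    """Map each character to its list of kMandarin pinyin readings.

    `lines` is any iterable of text lines in the assumed Unihan format:
    code point, field name, and value separated by tabs.
    """
    readings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a "code point / field / value" record
        codepoint, field, value = parts
        if field != "kMandarin":
            continue  # ignore other reading fields
        char = chr(int(codepoint[2:], 16))  # "U+7231" -> "爱"
        readings[char] = value.split()  # a value may hold several readings
    return readings


# Small in-memory sample standing in for the real file.
sample = [
    "# comment line",
    "U+7231\tkMandarin\tài",
    "U+7231\tkCantonese\toi3",
]
print(parse_mandarin_readings(sample))
```

To run it on the real data, pass an open file instead of the sample list, for example `open("Unihan_Readings.txt", encoding="utf-8")`.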
The Unihan Database doesn't have Bopomofo (Zhuyin) pronunciations, but it has Pinyin readings. Converting from Pinyin to Zhuyin is simple; there are a lot of online tools that can do it for you.
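If you would rather do the conversion in your own code than use an online tool, here is a minimal sketch for a single tone-marked Pinyin syllable. The table and the function names (`split_tone`, `pinyin_to_zhuyin`) are my own illustration, not part of Unihan. It covers ordinary syllables only: rare ones such as bare `m`, `n`, `ng`, or `ê` raise `ValueError`, and erhua suffixes are not handled.

```python
import unicodedata

# Combining tone marks -> tone number (no mark means neutral tone, 5).
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}
# Zhuyin tone symbols; tone 1 is unmarked, neutral is a leading dot.
ZHUYIN_TONES = {1: "", 2: "\u02ca", 3: "\u02c7", 4: "\u02cb", 5: "\u02d9"}

INITIALS = {
    "b": "ㄅ", "p": "ㄆ", "m": "ㄇ", "f": "ㄈ", "d": "ㄉ", "t": "ㄊ",
    "n": "ㄋ", "l": "ㄌ", "g": "ㄍ", "k": "ㄎ", "h": "ㄏ", "j": "ㄐ",
    "q": "ㄑ", "x": "ㄒ", "zh": "ㄓ", "ch": "ㄔ", "sh": "ㄕ", "r": "ㄖ",
    "z": "ㄗ", "c": "ㄘ", "s": "ㄙ",
}
# "v" stands for ü; both the full and abbreviated spellings are listed.
FINALS = {
    "a": "ㄚ", "o": "ㄛ", "e": "ㄜ", "ai": "ㄞ", "ei": "ㄟ", "ao": "ㄠ",
    "ou": "ㄡ", "an": "ㄢ", "en": "ㄣ", "ang": "ㄤ", "eng": "ㄥ", "er": "ㄦ",
    "i": "ㄧ", "ia": "ㄧㄚ", "io": "ㄧㄛ", "ie": "ㄧㄝ", "iao": "ㄧㄠ",
    "iu": "ㄧㄡ", "iou": "ㄧㄡ", "ian": "ㄧㄢ", "in": "ㄧㄣ",
    "iang": "ㄧㄤ", "ing": "ㄧㄥ", "iong": "ㄩㄥ",
    "u": "ㄨ", "ua": "ㄨㄚ", "uo": "ㄨㄛ", "uai": "ㄨㄞ", "ui": "ㄨㄟ",
    "uei": "ㄨㄟ", "uan": "ㄨㄢ", "un": "ㄨㄣ", "uen": "ㄨㄣ",
    "uang": "ㄨㄤ", "ong": "ㄨㄥ", "ueng": "ㄨㄥ",
    "v": "ㄩ", "ve": "ㄩㄝ", "van": "ㄩㄢ", "vn": "ㄩㄣ",
}
# After these initials a written "i" has no Zhuyin symbol of its own.
EMPTY_I_INITIALS = {"zh", "ch", "sh", "r", "z", "c", "s"}


def split_tone(syllable):
    """Return (syllable without tone mark, with ü as 'v'; tone number)."""
    tone, base = 5, []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    plain = unicodedata.normalize("NFC", "".join(base)).replace("ü", "v")
    return plain, tone


def pinyin_to_zhuyin(syllable):
    plain, tone = split_tone(syllable)
    # Undo the y/w spelling rules so the syllable starts with its final.
    if plain.startswith("yu"):
        plain = "v" + plain[2:]
    elif plain.startswith("yi") or plain.startswith("wu"):
        plain = plain[1:]
    elif plain.startswith("y"):
        plain = "i" + plain[1:]
    elif plain.startswith("w"):
        plain = "u" + plain[1:]
    # Two-letter initials must be tried before one-letter ones.
    initial = plain[:2] if plain[:2] in INITIALS else ""
    if not initial and plain[:1] in INITIALS:
        initial = plain[0]
    final = plain[len(initial):]
    if initial in ("j", "q", "x") and final.startswith("u"):
        final = "v" + final[1:]  # "u" after j/q/x is really ü
    if initial in EMPTY_I_INITIALS and final == "i":
        zhuyin_final = ""
    elif final in FINALS:
        zhuyin_final = FINALS[final]
    else:
        raise ValueError(f"unsupported syllable: {syllable!r}")
    body = INITIALS.get(initial, "") + zhuyin_final
    # The neutral-tone dot precedes the syllable; other marks follow it.
    return ZHUYIN_TONES[5] + body if tone == 5 else body + ZHUYIN_TONES[tone]


print(pinyin_to_zhuyin("ài"))
```

The table-driven approach keeps the spelling rules (y/w prefixes, ü after j/q/x, the silent "i" after zh/ch/sh/r/z/c/s) in one place, so unsupported syllables fail loudly instead of producing wrong output.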
As for licensing issues, the Unihan Database data files have a liberal copyright notice. So, you shouldn't run into any problems using that data in your own software.