Penelope

Penelope is a multi-tool for creating, editing and converting dictionaries, especially for eReader devices.

Version: 3.1.3
Date: 2016-09-23
Developer: Alberto Pettarin
License: the MIT License (MIT)
Contact: click here

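With the current version you can:

convert a dictionary from/to the following formats (R = read, W = write):
- Bookeen Cybook Odyssey (R/W)
- CSV (R/W)
- EPUB (W only)
- MOBI (Kindle, W only)
- Kobo (R index only, W unencrypted/unobfuscated only)
- StarDict (R/W)
- XML (R/W)
merge several dictionaries (of the same type) into a single dictionary
merge multiple definitions for the same headword
sort by headword and/or by definition
define your own input parser to merge/sort/edit definitions
define your own collation function (bookeen output format only)
output an EPUB file containing the dictionary (e.g., to cope with the lack of a search function on your eReader)
output a MOBI (Kindle) dictionary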
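
To make the merge and sort features more concrete, here is a minimal Python sketch of what merging definitions for the same headword and sorting by headword could look like. It is an illustration only, not Penelope's actual implementation: the function names and the (headword, definition) tuple layout are assumptions, and the ' | ' separator simply mirrors the documented default of --merge-separator.

```python
def merge_definitions(entries, separator=" | "):
    """Merge definitions that share a headword, keeping first-seen order.

    `entries` is a list of (headword, definition) tuples (hypothetical layout).
    """
    merged = {}  # dicts keep insertion order on Python 3.7+
    for headword, definition in entries:
        merged.setdefault(headword, []).append(definition)
    return [(hw, separator.join(defs)) for hw, defs in merged.items()]


def sort_by_headword(entries, ignore_case=False, reverse=False):
    """Sort (headword, definition) tuples by headword."""
    if ignore_case:
        key = lambda entry: entry[0].lower()
    else:
        key = lambda entry: entry[0]
    return sorted(entries, key=key, reverse=reverse)


entries = [("cat", "gatto"), ("dog", "cane"), ("cat", "felino")]
merged = merge_definitions(entries)
# merged == [("cat", "gatto | felino"), ("dog", "cane")]
```

On the command line, the corresponding behavior is requested with --merge-definitions, --sort-by-headword, --sort-ignore-case and --sort-reverse.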

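Important update

2016-04-17 Sadly, I can no longer afford to spend time working on Penelope, because 100% of my FLOSS time is taken by other FLOSS projects and, like everyone else, I still need to pay rent and bills and spend time with family and friends. Therefore I will not work on issues or pull requests: do not expect them to be handled at all. I am actively looking for another developer to take over this project. (This notice should be removed once the transition happens.) If you need to convert dictionaries and the current version of Penelope does not work for you, consider having a look at PyGlossary. My sincere apologies for the inconvenience.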

Installation

Using pip

Open a console and type:
```
$ [sudo] pip install penelope
```
That's it! Run penelope without arguments (or with -h or --help) to get the manual:
```
$ penelope
```

This procedure will install lxml and marisa-trie. You might need to install dictzip (StarDict output) and kindlegen (MOBI output) separately, see below.

From source code

Get the source code:
- clone this repo with git:
```
$ git clone https://github.com/pettarin/penelope.git
```
- or download the latest release and uncompress it somewhere,
- or download the current master ZIP and uncompress it somewhere.
Open a console and enter the (cloned) penelope directory:
```
$ cd /path/to/penelope
```
That's it! Run it without arguments (or with -h or --help) to get the manual:
```
$ python -m penelope
```

This procedure will not install any dependencies: you will need to do that manually, see below.

Dependencies

Python, version 2.7.x or 3.4.x (or above)
To write StarDict dictionaries: the dictzip executable, available in your $PATH or specified with --dictzip-path:
```
$ [sudo] apt-get install dictzip
```
To read/write Kobo dictionaries: the Python module marisa-trie:
```
$ [sudo] pip install marisa-trie
```
or the MARISA executables, available in your $PATH or specified with --marisa-bin-path
To write MOBI Kindle dictionaries: the kindlegen executable, available in your $PATH or specified with --kindlegen-path
To read/write XML dictionaries: the Python module lxml:
```
$ [sudo] pip install lxml
```
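
If you are unsure which of these optional dependencies are already present, a short check along the following lines may help. This is a sketch for Python 3 only, not part of Penelope: the function name is made up for illustration, and it assumes the marisa-trie package is imported under the module name marisa_trie.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report which optional executables and modules appear to be available."""
    executables = ["dictzip", "kindlegen"]  # looked up in $PATH
    modules = ["lxml", "marisa_trie"]       # looked up as importable modules
    report = {}
    for name in executables:
        report[name] = shutil.which(name) is not None
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, available in sorted(check_dependencies().items()):
        print("%-12s %s" % (name, "found" if available else "missing"))
```

An executable reported as missing here can still be used if you pass its location explicitly with --dictzip-path or --kindlegen-path.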

Usage

usage:
  $ penelope -h
  $ penelope -i INPUT_FILE -j INPUT_FORMAT -f LANGUAGE_FROM -t LANGUAGE_TO -p OUTPUT_FORMAT -o OUTPUT_FILE [OPTIONS]
  $ penelope -i IN1,IN2[,IN3...] -j INPUT_FORMAT -f LANGUAGE_FROM -t LANGUAGE_TO -p OUTPUT_FORMAT -o OUTPUT_FILE [OPTIONS]

description:
  Convert dictionary file(s) with file name prefix INPUT_FILE from format INPUT_FORMAT to format OUTPUT_FORMAT, saving it as OUTPUT_FILE.
  The dictionary is from LANGUAGE_FROM to LANGUAGE_TO, possibly the same.
  You can merge several dictionaries (with the same format), by providing a list of comma-separated prefixes, as shown by the third synopsis above.

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           enable debug mode (default: False)
  -f LANGUAGE_FROM, --language-from LANGUAGE_FROM
                        from language (ISO 639-1 code)
  -i INPUT_FILE, --input-file INPUT_FILE
                        input file name prefix(es). Multiple prefixes must be
                        comma-separated.
  -j INPUT_FORMAT, --input-format INPUT_FORMAT
                        from format (values: bookeen|csv|kobo|stardict|xml)
  -k, --keep            keep temporary files (default: False)
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        output file name
  -p OUTPUT_FORMAT, --output-format OUTPUT_FORMAT
                        to format (values:
                        bookeen|csv|epub|kobo|mobi|stardict|xml)
  -t LANGUAGE_TO, --language-to LANGUAGE_TO
                        to language (ISO 639-1 code)
  -v, --version         print version and exit
  --author AUTHOR       author string
  --copyright COPYRIGHT
                        copyright string
  --cover-path COVER_PATH
                        path of the cover image file
  --description DESCRIPTION
                        description string
  --email EMAIL         email string
  --identifier IDENTIFIER
                        identifier string
  --license LICENSE     license string
  --title TITLE         title string
  --website WEBSITE     website string
  --year YEAR           year string
  --apply-css APPLY_CSS
                        apply the given CSS file (epub and mobi output only)
  --bookeen-collation-function BOOKEEN_COLLATION_FUNCTION
                        use the specified collation function
  --bookeen-install-file
                        create *.install file (default: False)
  --csv-fs CSV_FS       CSV field separator (default: ',')
  --csv-ignore-first-line
                        ignore the first line of the input CSV file(s)
                        (default: False)
  --csv-ls CSV_LS       CSV line separator (default: '\n')
  --dictzip-path DICTZIP_PATH
                        path to dictzip executable
  --epub-no-compress    do not create the compressed container (epub output
                        only, default: False)
  --escape-strings      escape HTML strings (default: False)
  --flatten-synonyms    flatten synonyms, creating a new entry with
                        headword=synonym and using the definition of the
                        original headword (default: False)
  --group-by-prefix-function GROUP_BY_PREFIX_FUNCTION
                        compute the prefix of headwords using the given prefix
                        function file
  --group-by-prefix-length GROUP_BY_PREFIX_LENGTH
                        group headwords by prefix of given length (default: 2)
  --group-by-prefix-merge-across-first
                        merge headword groups even when the first character
                        changes (default: False)
  --group-by-prefix-merge-min-size GROUP_BY_PREFIX_MERGE_MIN_SIZE
                        merge headword groups until the given minimum number
                        of headwords is reached (default: 0, meaning no merge
                        will take place)
  --ignore-case         ignore headword case, all headwords will be lowercased
                        (default: False)
  --ignore-synonyms     ignore synonyms, not reading/writing them if present
                        (default: False)
  --include-index-page  include an index page (epub and mobi output only,
                        default: False)
  --input-file-encoding INPUT_FILE_ENCODING
                        use the specified encoding for reading the raw
                        contents of input file(s) (default: 'utf-8')
  --input-parser INPUT_PARSER
                        use the specified parser function after reading the
                        raw contents of input file(s)
  --kindlegen-path KINDLEGEN_PATH
                        path to kindlegen executable
  --marisa-bin-path MARISA_BIN_PATH
                        path to MARISA bin directory
  --marisa-index-size MARISA_INDEX_SIZE
                        maximum size of the MARISA index (default: 1000000)
  --merge-definitions   merge definitions for the same headword (default:
                        False)
  --merge-separator MERGE_SEPARATOR
                        add this string between merged definitions (default: '
                        | ')
  --mobi-no-kindlegen   do not run kindlegen, keep .opf and .html files
                        (default: False)
  --no-definitions      do not output definitions for EPUB and MOBI formats
                        (default: False)
  --sd-ignore-sametypesequence
                        ignore the value of sametypesequence in StarDict .ifo
                        files (default: False)
  --sd-no-dictzip       do not compress the .dict file in StarDict files
                        (default: False)
  --sort-after          sort after merging/flattening (default: False)
  --sort-before         sort before merging/flattening (default: False)
  --sort-by-definition  sort by definition (default: False)
  --sort-by-headword    sort by headword (default: False)
  --sort-ignore-case    ignore case when sorting (default: False)
  --sort-reverse        reverse the sort order (default: False)

examples:

  $ penelope -i dict.csv -j csv -f en -t it -p stardict -o output.zip
    Convert en->it dictionary dict.csv (in CSV format) into output.zip (in StarDict format)

  $ penelope -i dict.csv -j csv -f en -t it -p stardict -o output.zip --merge-definitions
    As above, but also merge definitions

  $ penelope -i d1,d2,d3 -j csv -f en -t it -p csv -o output.csv --sort-after --sort-by-headword
    Merge CSV dictionaries d1, d2, and d3 into output.csv, sorting by headword

  $ penelope -i d1,d2,d3 -j csv -f en -t it -p csv -o output.csv --sort-after --sort-by-headword --sort-ignore-case
    As above, but ignore case for sorting

  $ penelope -i d1,d2,d3 -j csv -f en -t it -p csv -o output.csv --sort-after --sort-by-headword --sort-reverse
    As above, but reverse the order

  $ penelope -i dict.zip -j stardict -f en -t it -p csv -o output.csv
    Convert en->it dictionary dict.zip (in StarDict format) into output.csv (in CSV format)

  $ penelope -i dict.zip -j stardict -f en -t it -p csv -o output.csv --ignore-synonyms
    As above, but do not read the .syn synonym file if present

  $ penelope -i dict.zip -j stardict -f en -t it -p csv -o output.csv --flatten-synonyms
    As above, but flatten synonyms

  $ penelope -i dict.zip -j stardict -f en -t it -p bookeen -o output
    Convert dict.zip into output.dict.idx and output.dict for Bookeen devices

  $ penelope -i dict.zip -j stardict -f en -t it -p kobo -o dicthtml-en-it
    Convert dict.zip into dicthtml-en-it.zip for Kobo devices

  $ penelope -i dict.csv -j csv -f en -t it -p mobi -o output.mobi --cover-path mycover.png --title "My English->Italian Dictionary"
    Convert dict.csv into a MOBI (Kindle) dictionary, using the specified cover image and title

  $ penelope -i dict.xml -j xml -f en -t it -p epub -o output.epub
    Convert dict.xml into an EPUB dictionary

  $ penelope -i dict.xml -j xml -f en -t it -p epub -o output.epub --no-definitions
    As above, but do not output definitions
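
If you drive Penelope from a script, the synopsis above maps naturally onto an argument list. The following Python sketch only assembles the command and does not run it. The helper name and its parameters are hypothetical; the flags themselves are the ones listed in the help text above.

```python
def build_penelope_command(inputs, input_format, language_from, language_to,
                           output_format, output_file, options=()):
    """Return the argument list for one penelope invocation (not executed)."""
    return [
        "penelope",
        "-i", ",".join(inputs),  # multiple prefixes must be comma-separated
        "-j", input_format,
        "-f", language_from,
        "-t", language_to,
        "-p", output_format,
        "-o", output_file,
    ] + list(options)


command = build_penelope_command(
    ["d1", "d2", "d3"], "csv", "en", "it", "csv", "output.csv",
    options=["--sort-after", "--sort-by-headword"],
)
# To actually run it, pass the list to subprocess.check_call(command).
```

Building a list rather than a single string avoids shell quoting problems with values such as --title "My English->Italian Dictionary".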

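You can find the ISO 639-1 language codes here.

Installing the dictionaries

Bookeen Odyssey devices

For the sake of clarity, let's assume you want to use an IT -> EN dictionary.

On your PC, produce/download the IT -> EN dictionary files it-en.dict and it-en.dict.idx.
Connect your Odyssey device to your PC via the USB cable.
Using your file manager, copy the two files it-en.dict and it-en.dict.idx from your PC into the Dictionaries/ directory of your Odyssey device.
Reboot your Odyssey, open a book in Italian and select a word: the definition in English should appear. (Select a common word for this test, to be sure it is in the dictionary!)

The Bookeen dictionary software selects the dictionary to use by reading the dc:language metadata of your eBook. Make sure your eBooks have proper dc:language metadata, otherwise the correct dictionary might not be loaded.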
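
To check which dc:language value an EPUB declares, you can inspect its package document. The sketch below is not part of Penelope and uses only the Python standard library. It assumes a regular EPUB whose META-INF/container.xml points to the package (OPF) file, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_epub_languages(epub_file):
    """Return the dc:language values declared in an EPUB package document.

    `epub_file` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        # The container file names the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find(".//{%s}rootfile" % CONTAINER_NS)
        if rootfile is None:
            return []
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    return [
        (element.text or "").strip()
        for element in package.iter("{%s}language" % DC_NS)
    ]


# Example (hypothetical file name):
# print(read_epub_languages("my-italian-book.epub"))
```

For the IT -> EN example above, you would expect the book to declare a value such as it.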

Kobo devices

At the time of writing (2016-02-16), Kobo devices will load a dictionary only if its file name is one of the file names of the official Kobo dictionaries:

dicthtml.zip (EN)
dicthtml-de.zip (DE), dicthtml-de-en.zip (DE -> EN), dicthtml-en-de.zip (EN -> DE),
dicthtml-es.zip (ES), dicthtml-es-en.zip (ES -> EN), dicthtml-en-es.zip (EN -> ES),
dicthtml-fr.zip (FR), dicthtml-fr-en.zip (FR -> EN), dicthtml-en-fr.zip (EN -> FR),
dicthtml-it.zip (IT), dicthtml-it-en.zip (IT -> EN), dicthtml-en-it.zip (EN -> IT),
dicthtml-nl.zip (NL)
dicthtml-ja.zip (JA), dicthtml-en-ja.zip (EN -> JA),
dicthtml-pt.zip (PT), dicthtml-pt-en.zip (PT -> EN), dicthtml-en-pt.zip (EN -> PT)

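(See this MobileRead thread.)

Hence, if you want to install a custom dictionary produced with Penelope, you need to overwrite one of the official Kobo dictionaries, effectively losing the possibility of using the latter.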
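
Before copying a file to the device, it can be handy to check whether its name is one the Kobo will accept. The sketch below encodes the list above as of 2016-02-16. The constant and function names are made up for illustration, and newer firmware may accept a different set of names.

```python
# File names of the official Kobo dictionaries, as listed above (2016-02-16).
OFFICIAL_KOBO_NAMES = frozenset([
    "dicthtml.zip",
    "dicthtml-de.zip", "dicthtml-de-en.zip", "dicthtml-en-de.zip",
    "dicthtml-es.zip", "dicthtml-es-en.zip", "dicthtml-en-es.zip",
    "dicthtml-fr.zip", "dicthtml-fr-en.zip", "dicthtml-en-fr.zip",
    "dicthtml-it.zip", "dicthtml-it-en.zip", "dicthtml-en-it.zip",
    "dicthtml-nl.zip",
    "dicthtml-ja.zip", "dicthtml-en-ja.zip",
    "dicthtml-pt.zip", "dicthtml-pt-en.zip", "dicthtml-en-pt.zip",
])


def is_loadable_kobo_name(file_name):
    """Return True if file_name matches an official Kobo dictionary name."""
    return file_name in OFFICIAL_KOBO_NAMES


print(is_loadable_kobo_name("dicthtml-pl.zip"))  # a Polish name is not in the list
print(is_loadable_kobo_name("dicthtml-pt.zip"))  # the Portuguese name is
```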

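For example, suppose you want to use a Polish dictionary (dicthtml-pl.zip), and you are not interested in using the official Portuguese dictionary (dicthtml-pt.zip).

On your PC, produce/download the Polish dictionary dicthtml-pl.zip.
On your Kobo device, go to the settings and activate the Portuguese dictionary.
Connect your Kobo device to your PC via the USB cable.
Using your file manager, copy dicthtml-pl.zip from your PC into the .kobo/dict/ directory of your Kobo device. (Note that .kobo is a hidden directory: you might need to enable the "show hidden files/directories" setting of your file manager.)
Rename dicthtml-pl.zip to dicthtml-pt.zip.
Reboot your Kobo, open a book in Polish and select a word: the definition should appear. (Select a common word for this test, to be sure it is in the dictionary!)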
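
The copy and rename steps can also be scripted. The following Python sketch copies the custom dictionary straight to its final name inside .kobo/dict/. It is an illustration under stated assumptions: the function name is made up, and the mount point in the usage comment is hypothetical and depends on your operating system.

```python
import os
import shutil


def install_kobo_dictionary(custom_zip, kobo_root, official_name):
    """Copy custom_zip into <kobo_root>/.kobo/dict/ under official_name.

    Any existing file with that name is overwritten, as in the manual steps.
    """
    dict_dir = os.path.join(kobo_root, ".kobo", "dict")
    if not os.path.isdir(dict_dir):
        raise IOError("Kobo dictionary directory not found: %s" % dict_dir)
    target = os.path.join(dict_dir, official_name)
    shutil.copyfile(custom_zip, target)
    return target


# Example (hypothetical mount point):
# install_kobo_dictionary("dicthtml-pl.zip", "/media/KOBOeReader", "dicthtml-pt.zip")
```

You still need to activate the overwritten dictionary in the device settings and reboot the Kobo afterwards.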

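Note that updating the firmware of your Kobo might overwrite your custom dictionary with the official one. Hence, keep a backup copy of your custom dictionary in a safe place, e.g. on your PC or on an SD card.

You can find a list of custom dictionaries, mostly made with Penelope, in this MobileRead thread.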

License

Penelope is released under the MIT License since version 2.0.0 (2014-06-30).

Previous versions, hosted on Google Code, were released under the GNU GPL 3 License.

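Limitations and Missing Features

Bookeen has no official documentation for its dictionary format (it has been reverse-engineered), YMMV
Kobo has no official documentation for its dictionary format (it has been reverse-engineered), YMMV
Reading Kobo dictionaries is only partially supported: the index is read, but the definitions are not, since they are encrypted/obfuscated
Reading EPUB (3) dictionaries is not supported; the writing part needs polishing/refactoring
Reading PRC/MOBI (Kindle) dictionaries is not supported
There are some limitations on the StarDict files that can be read (see the comments in format_stardict.py)
Documentation is not complete
Unit tests are missing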

Sponsors

December 2015: IngleseXpress.it, "Grazie per averci aiutato a pubblicare per Kindle il Dizionario Inglese-Italiano della Pronuncia Scritta Semplificata!"

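Acknowledgments

Many thanks to:

uwelovesdonna for providing ideas for improving the code and for setting up many pages of the project wiki;
Jens Sadowski for pointing out a bug with Unicode filenames and suggesting the use of multiset dict() instead of set dict();
oldnat for pointing out bugs on Windows and Python 3;
Wolfgang Miller-Reichling for providing the code for reading CSV dictionaries;
branok for providing ideas and initial code for the German collation function;
pal for suggesting passing the -l switch to MARISA_BUILD;
Lukas Brückner for suggesting escaping & < > when outputting in XML format;
Stephan Lichtenhagen for suggesting forcing UTF-8 encoding on Python 3;
niconavarete for pointing out the dependency on $CWD (issue #1), fixed in v2.0.1;
elchamaco for providing a StarDict dictionary with a .syn file for testing.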