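Image/Manga Translator

Translate text in manga and images.
Chinese README | Change Log
Join us on Discord: https://discord.gg/Ak8APNy4vb

Some manga and images will never be translated, which is why this project was created.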

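Image/Manga Translator
- Samples
- Online Demo
- Disclaimer
- Installation
  - Local setup
    - Pip/venv
    - Additional instructions for Windows
  - Docker
    - Hosting the web server
    - Using as CLI
    - Setting Translation Secrets
    - Using with Nvidia GPU
    - Building locally
- Usage
  - Batch mode (default)
  - Demo mode
  - Web Mode
  - API Mode
- Related Projects
- Docs
  - Recommended Modules
    - Tips to improve translation quality
  - Options
  - Language Code Reference
  - Translators Reference
  - GPT Config Reference
  - Using Gimp for rendering
  - API Documentation
    - Synchronous mode
    - Asynchronous mode
    - Manual translation
- Next steps
- Support Us
  - Thanks To All Our Contributors: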

Samples

Please note that the samples may not always be up to date and may not represent the current main branch version.

Original	Translated
(Source @09ra_19ra)	(Mask)
(Source @VERTIGRIS_ART)	`--detector ctd` (Mask)
(Source @hiduki_yayoi)	`--translator none` (Mask)
(Source @rikak)	(Mask)

Online Demo

Official demo (by zyddnys): https://touhou.ai/imgtrans/
Browser userscript (by QiroNT): https://greasyfork.org/scripts/437569

Note that the demo may occasionally be down because Google GCP keeps restarting my instance. In that case, please wait for me to restart the service, which may take up to 24 hours.
This online demo uses the current main branch version.

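Disclaimer

Successor to MMDOCR-HighPerformance.
This is a hobby project, and you are welcome to contribute!
Currently this is only a simple demo with many imperfections. We need your support to make this project better!
Primarily designed for translating Japanese text, but it also supports Chinese, English and Korean.
Supports inpainting, text rendering and colorization.

Installation

Local setup

Pip/venv

# First, you need to have Python(>=3.8) installed on your system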
# The latest version often does not work with some pytorch libraries yet
$ python --version
Python 3.10.6

# Clone this repo
$ git clone https://github.com/zyddnys/manga-image-translator.git

# Create venv
$ python -m venv venv

# Activate venv
$ source venv/bin/activate

# For --use-gpu option go to https://pytorch.org/ and follow
# pytorch installation instructions. Add `--upgrade --force-reinstall`
# to the pip command to overwrite the currently installed pytorch version.

# Install the dependencies
$ pip install -r requirements.txt

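The models will be downloaded into ./models at runtime.

Additional instructions for Windows

Before you start the pip install, first install Microsoft C++ Build Tools (Download, Instructions), as some pip dependencies will not compile without it (see #114).

To use CUDA on Windows, install the correct PyTorch version as instructed on https://pytorch.org/.

Docker

Requirements:

Docker (version 19.03+ required for CUDA / GPU acceleration)
Docker Compose (optional, needed only if you want to use the files in the demo/doc folder)
Nvidia Container Runtime (optional, needed only if you want to use CUDA)

This project has Docker support under the zyddnys/manga-image-translator:main image. This Docker image contains all required dependencies and models for the project. Note that the image is fairly large (~15GB).

Hosting the web server

To host the web server (CPU only), run: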

docker run -p 5003:5003 -v result:/app/result --ipc=host --rm zyddnys/manga-image-translator:main -l ENG --manga2eng -v --mode web --host=0.0.0.0 --port=5003

or

docker-compose -f demo/doc/docker-compose-web-with-cpu.yml up

Use whichever you prefer. The web server should start on port 5003, and images should become available in the /result folder.

Using as CLI

To use Docker with the CLI (i.e. in batch mode):

docker run -v <targetFolder>:/app/<targetFolder> -v <targetFolder>-translated:/app/<targetFolder>-translated --ipc=host --rm zyddnys/manga-image-translator:main --mode=batch -i=/app/<targetFolder> <cli flags>

Note: If you need to reference files on your host machine, you will need to mount the associated files as volumes into the /app folder inside the container. Paths for the CLI must be the internal Docker path /app/... rather than the paths on your host machine.

Setting Translation Secrets

Some translation services require API keys to function. To set these, pass them to the Docker container as environment variables. For example:

docker run --env="DEEPL_AUTH_KEY=xxx" --ipc=host --rm zyddnys/manga-image-translator:main <cli flags>

Using with Nvidia GPU

To use a supported GPU, first read the initial Docker section above; some special dependencies are required.

Run the container with the following flags set:

docker run ... --gpus=all ... zyddnys/manga-image-translator:main ... --use-gpu

Or (for the web server + GPU):

docker-compose -f demo/doc/docker-compose-web-with-gpu.yml up

Building locally

To build the Docker image locally, run the following (you will need make on your machine):

make build-image

Then, to test the built image, run:

make run-web-server

Usage

Batch mode (default)

# use `--use-gpu` for speedup if you have a compatible NVIDIA GPU.
# use `--target-lang <language_code>` to specify a target language.
# use `--inpainter=none` to disable inpainting.
# use `--translator=none` if you only want to use inpainting (blank bubbles)
# replace <path> with the path to the image folder or file.
$ python -m manga_translator -v --translator=google -l ENG -i <path>
# results can be found under `<path_to_image_folder>-translated`.
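
The batch invocation above can also be scripted. The following is a minimal Python sketch, assuming it is run from the cloned repository root with the venv active. The helper names `build_batch_command`, `expected_output_dir` and `run_batch` are hypothetical and not part of the project; only the flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_batch_command(image_path, target_lang="ENG", translator="google", use_gpu=False):
    """Assemble the batch-mode command line shown above as an argument list."""
    cmd = [
        sys.executable, "-m", "manga_translator",
        "-v",
        f"--translator={translator}",
        "-l", target_lang,
        "-i", str(image_path),
    ]
    if use_gpu:
        cmd.append("--use-gpu")
    return cmd


def expected_output_dir(image_folder):
    """Per the comment above, results land in `<path_to_image_folder>-translated`."""
    folder = Path(image_folder)
    return folder.with_name(folder.name + "-translated")


def run_batch(image_path, **options):
    """Run the translator and return the folder where results are expected."""
    subprocess.run(build_batch_command(image_path, **options), check=True)
    return expected_output_dir(image_path)


# Print the command without running it ("my_manga" is a made-up folder name).
print(" ".join(build_batch_command("my_manga")))
```

Building the command as a list rather than a single string avoids shell quoting problems when the image path contains spaces.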

Demo mode

# saves singular image into /result folder for demonstration purposes
# use `--mode demo` to enable demo translation.
# replace <path> with the path to the image file.
$ python -m manga_translator --mode demo -v --translator=google -l ENG -i <path>
# result can be found in `result/`.

Web Mode

# use `--mode web` to start a web server.
$ python -m manga_translator -v --mode web --use-gpu
# the demo will be serving on http://127.0.0.1:5003

API Mode

# use `--mode api` to start the API server.
$ python -m manga_translator -v --mode api --use-gpu
# the API will be serving on http://127.0.0.1:5003

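Docs

Recommended Modules

Detector:

ENG: ??
JPN: ??
CHS: ??
KOR: ??
Using --detector ctd can increase the number of text lines detected.

OCR:

ENG: ??
JPN: ??
CHS: ??
KOR: 48px

Translator:

JPN -> ENG: Sugoi
CHS -> ENG: ??
CHS -> JPN: ??
JPN -> CHS: ??
ENG -> JPN: ??
ENG -> CHS: ??

Inpainter: ??

Colorizer: mc2

Tips to improve translation quality

Small resolutions can sometimes trip up the detector, which is not very good at picking up irregular text sizes. To work around this, use an upscaler by specifying --upscale-ratio 2 or any other value.
If the rendered text is too small to read, specify for instance --font-size-minimum 30, or use the --manga2eng renderer, which tries to adapt to the detected text bubbles.
Specify a font with --font-path, for example --font-path fonts/anime_ace_3.ttf.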

Options

-h, --help                                   show this help message and exit
-m, --mode {demo,batch,web,web_client,ws,api}
                                             Run demo in single image demo mode (demo), batch
                                             translation mode (batch), web service mode (web)
-i, --input INPUT [INPUT ...]                Path to an image file if using demo mode, or path to an
                                             image folder if using batch mode
-o, --dest DEST                              Path to the destination folder for translated images in
                                             batch mode
-l, --target-lang {CHS,CHT,CSY,NLD,ENG,FRA,DEU,HUN,ITA,JPN,KOR,PLK,PTB,ROM,RUS,ESP,TRK,UKR,VIN,ARA,CNR,SRP,HRV,THA,IND,FIL}
                                             Destination language
-v, --verbose                                Print debug info and save intermediate images in result
                                             folder
-f, --format {png,webp,jpg,xcf,psd,pdf}      Output format of the translation.
--attempts ATTEMPTS                          Retry attempts on encountered error. -1 means infinite
                                             times.
--ignore-errors                              Skip image on encountered error.
--overwrite                                  Overwrite already translated images in batch mode.
--skip-no-text                               Skip image without text (Will not be saved).
--model-dir MODEL_DIR                        Model directory (by default ./models in project root)
--use-gpu                                   Turn on/off gpu
--use-gpu-limited                           Turn on/off gpu (excluding offline translator)
--detector {default,ctd,craft,none}          Text detector used for creating a text mask from an
                                             image, DO NOT use craft for manga, it's not designed
                                             for it
--ocr {32px,48px,48px_ctc,mocr}              Optical character recognition (OCR) model to use
--use-mocr-merge                             Use bbox merge when Manga OCR inference.
--inpainter {default,lama_large,lama_mpe,sd,none,original}
                                             Inpainting model to use
--upscaler {waifu2x,esrgan,4xultrasharp}     Upscaler to use. --upscale-ratio has to be set for it
                                             to take effect
--upscale-ratio UPSCALE_RATIO                Image upscale ratio applied before detection. Can
                                             improve text detection.
--colorizer {mc2}                            Colorization model to use.
--translator {google,youdao,baidu,deepl,papago,caiyun,gpt3,gpt3.5,gpt4,none,original,offline,nllb,nllb_big,sugoi,jparacrawl,jparacrawl_big,m2m100,m2m100_big,sakura}
                                             Language translator to use
--translator-chain TRANSLATOR_CHAIN          Output of one translator goes in another. Example:
                                             --translator-chain "google:JPN;sugoi:ENG".
--selective-translation SELECTIVE_TRANSLATION
                                             Select a translator based on detected language in
                                             image. Note the first translation service acts as
                                             default if the language isn't defined. Example:
                                             --selective-translation "google:JPN;sugoi:ENG".
--revert-upscaling                           Downscales the previously upscaled image after
                                             translation back to original size (Use with --upscale-
                                             ratio).
--detection-size DETECTION_SIZE              Size of image used for detection
--det-rotate                                 Rotate the image for detection. Might improve
                                             detection.
--det-auto-rotate                            Rotate the image for detection to prefer vertical
                                             textlines. Might improve detection.
--det-invert                                 Invert the image colors for detection. Might improve
                                             detection.
--det-gamma-correct                          Applies gamma correction for detection. Might improve
                                             detection.
--unclip-ratio UNCLIP_RATIO                  How much to extend text skeleton to form bounding box
--box-threshold BOX_THRESHOLD                Threshold for bbox generation
--text-threshold TEXT_THRESHOLD              Threshold for text detection
--min-text-length MIN_TEXT_LENGTH            Minimum text length of a text region
--no-text-lang-skip                          Don't skip text that is seemingly already in the target
                                             language.
--inpainting-size INPAINTING_SIZE            Size of image used for inpainting (too large will
                                             result in OOM)
--inpainting-precision {fp32,fp16,bf16}      Inpainting precision for lama, use bf16 while you can.
--colorization-size COLORIZATION_SIZE        Size of image used for colorization. Set to -1 to use
                                             full image size
--denoise-sigma DENOISE_SIGMA                Used by colorizer and affects color strength, range
                                             from 0 to 255 (default 30). -1 turns it off.
--mask-dilation-offset MASK_DILATION_OFFSET  By how much to extend the text mask to remove left-over
                                             text pixels of the original image.
--font-size FONT_SIZE                        Use fixed font size for rendering
--font-size-offset FONT_SIZE_OFFSET          Offset font size by a given amount, positive number
                                             increase font size and vice versa
--font-size-minimum FONT_SIZE_MINIMUM        Minimum output font size. Default is
                                             image_sides_sum/200
--font-color FONT_COLOR                      Overwrite the text fg/bg color detected by the OCR
                                             model. Use hex string without the "#" such as FFFFFF
                                             for a white foreground or FFFFFF:000000 to also have a
                                             black background around the text.
--line-spacing LINE_SPACING                  Line spacing is font_size * this value. Default is 0.01
                                             for horizontal text and 0.2 for vertical.
--force-horizontal                           Force text to be rendered horizontally
--force-vertical                             Force text to be rendered vertically
--align-left                                 Align rendered text left
--align-center                               Align rendered text centered
--align-right                                Align rendered text right
--uppercase                                  Change text to uppercase
--lowercase                                  Change text to lowercase
--no-hyphenation                             If renderer should be splitting up words using a hyphen
                                             character (-)
--manga2eng                                  Render english text translated from manga with some
                                             additional typesetting. Ignores some other argument
                                             options
--gpt-config GPT_CONFIG                      Path to GPT config file, more info in README
--use-mtpe                                   Turn on/off machine translation post editing (MTPE) on
                                             the command line (works only on linux right now)
--save-text                                  Save extracted text and translations into a text file.
--save-text-file SAVE_TEXT_FILE              Like --save-text but with a specified file path.
--filter-text FILTER_TEXT                    Filter regions by their text with a regex. Example
                                             usage: --filter-text ".*badtext.*"
--pre-dict FILE_PATH                         Path to the pre-translation dictionary file. One entry per line,
                                             Comments can be added with `#` and `//`.
                                             usage: //Example
                                                    dog cat #Example
                                                    abc def
                                                    abc
--post-dict FILE_PATH                        Path to the post-translation dictionary file. Same as above.
--skip-lang                                  Skip translation if source image is one of the provided languages,
                                             use comma to separate multiple languages. Example: JPN,ENG
--prep-manual                                Prepare for manual typesetting by outputting blank,
                                             inpainted images, plus copies of the original for
                                             reference
--font-path FONT_PATH                        Path to font file
--gimp-font GIMP_FONT                        Font family to use for gimp rendering.
--host HOST                                  Used by web module to decide which host to attach to
--port PORT                                  Used by web module to decide which port to attach to
--nonce NONCE                                Used by web module as secret for securing internal web
                                             server communication
--ws-url WS_URL                              Server URL for WebSocket mode
--save-quality SAVE_QUALITY                  Quality of saved JPEG image, range from 0 to 100 with
                                             100 being best
--ignore-bubble IGNORE_BUBBLE                The threshold for ignoring text in non bubble areas,
                                             with valid values ranging from 1 to 50, does not ignore
                                             others. Recommendation 5 to 10. If it is too low,
                                             normal bubble areas may be ignored, and if it is too
                                             large, non bubble areas may be considered normal
                                             bubbles
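
The `--translator-chain` value listed above is a semicolon-separated list of `translator:LANG` pairs. As an illustration only, the sketch below splits such a string into steps. `parse_translator_chain` is a hypothetical helper, not the project's own parser, and reading each pair as "translator and the language it translates into" is an interpretation of the example in the help text.

```python
def parse_translator_chain(spec):
    """Split a chain such as 'google:JPN;sugoi:ENG' into (translator, language) steps."""
    steps = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        translator, sep, lang = part.partition(":")
        if not sep or not translator.strip() or not lang.strip():
            raise ValueError(f"expected 'translator:LANG', got {part!r}")
        steps.append((translator.strip(), lang.strip()))
    return steps


# Walk through the example from the help text, one step per line.
for translator, lang in parse_translator_chain("google:JPN;sugoi:ENG"):
    print(f"{translator} -> {lang}")
```

Validating the string before launching a long batch run catches typos such as a missing colon early.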

Language Code Reference

Used by the --target-lang or -l argument.

CHS : Chinese (Simplified)
CHT : Chinese (Traditional)
CSY : Czech
NLD : Dutch
ENG : English
FRA : French
DEU : German
HUN : Hungarian
ITA : Italian
JPN : Japanese
KOR : Korean
PLK : Polish
PTB : Portuguese (Brazil)
ROM : Romanian
RUS : Russian
ESP : Spanish
TRK : Turkish
UKR : Ukrainian
VIN : Vietnamese
ARA : Arabic
SRP : Serbian
HRV : Croatian
THA : Thai
IND : Indonesian
FIL : Filipino (Tagalog)

Translators Reference

Name	API Key	Offline	Note
~~google~~			Temporarily disabled
youdao	✔️		Requires `YOUDAO_APP_KEY` and `YOUDAO_SECRET_KEY`
baidu	✔️		Requires `BAIDU_APP_ID` and `BAIDU_SECRET_KEY`
deepl	✔️		Requires `DEEPL_AUTH_KEY`
caiyun	✔️		Requires `CAIYUN_TOKEN`
gpt3	✔️		Implements text-davinci-003. Requires `OPENAI_API_KEY`
gpt3.5	✔️		Implements gpt-3.5-turbo. Requires `OPENAI_API_KEY`
gpt4	✔️		Implements gpt-4. Requires `OPENAI_API_KEY`
papago
sakura			Requires `SAKURA_API_BASE`
offline		✔️	Chooses the most suitable offline translator for the language
sugoi		✔️	Sugoi V4.0 models
m2m100		✔️	Supports every language
m2m100_big		✔️
none		✔️	Translates to empty text
original		✔️	Keeps the original text

API Key: Whether the translator requires an API key to be set as an environment variable. For this you can create a .env file in the project root directory containing your API keys, like so:

OPENAI_API_KEY=sk-xxxxxxx...
DEEPL_AUTH_KEY=xxxxxxxx...

Offline: Whether the translator can be used offline.
Sugoi was created by mingshiba. Please support him at https://www.patreon.com/mingshiba
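
Before starting a run, it can help to check that the variables listed in the table are actually set. The sketch below is one way to do that. The mapping is copied from the table above, while `REQUIRED_ENV` and `missing_env_vars` are hypothetical names, not project code. It inspects the process environment only and does not read the .env file itself.

```python
import os

# Environment variables each translator needs, copied from the table above.
REQUIRED_ENV = {
    "youdao": ["YOUDAO_APP_KEY", "YOUDAO_SECRET_KEY"],
    "baidu": ["BAIDU_APP_ID", "BAIDU_SECRET_KEY"],
    "deepl": ["DEEPL_AUTH_KEY"],
    "caiyun": ["CAIYUN_TOKEN"],
    "gpt3": ["OPENAI_API_KEY"],
    "gpt3.5": ["OPENAI_API_KEY"],
    "gpt4": ["OPENAI_API_KEY"],
    "sakura": ["SAKURA_API_BASE"],
}


def missing_env_vars(translator, env=None):
    """Return the variables the given translator needs that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV.get(translator, []) if not env.get(name)]


# Report what is still missing for the youdao translator in the current environment.
print(missing_env_vars("youdao"))
```

Translators that are not in the mapping, such as the offline ones, simply yield an empty list.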

GPT Config Reference

Used by the --gpt-config argument.

# The prompt being fed into GPT before the text to translate.
# Use {to_lang} to indicate where the target language name should be inserted.
# Note: ChatGPT models don't use this prompt.
prompt_template: >
  Please help me to translate the following text from a manga to {to_lang}
  (if it's already in {to_lang} or looks like gibberish you have to output it as it is instead):\n

# What sampling temperature to use, between 0 and 2.
# Higher values like 0.8 will make the output more random,
# while lower values like 0.2 will make it more focused and deterministic.
temperature: 0.5

# An alternative to sampling with temperature, called nucleus sampling,
# where the model considers the results of the tokens with top_p probability mass.
# So 0.1 means only the tokens comprising the top 10% probability mass are considered.
top_p: 1

# The prompt being fed into ChatGPT before the text to translate.
# Use {to_lang} to indicate where the target language name should be inserted.
# Tokens used in this example: 57+
chat_system_template: >
  You are a professional translation engine, 
  please translate the story into a colloquial, 
  elegant and fluent content, 
  without referencing machine translations. 
  You must only translate the story, never interpret it.
  If there is any issue in the text, output it as is.

  Translate to {to_lang}.

# Samples being fed into ChatGPT to show an example conversation.
# In a [prompt, response] format, keyed by the target language name.
#
# Generally, samples should include some examples of translation preferences, and ideally
# some names of characters it's likely to encounter.
#
# If you'd like to disable this feature, just set this to an empty list.
chat_sample:
  Simplified Chinese: # Tokens used in this example: 88 + 84
    - <|1|>恥ずかしい… 目立ちたくない… 私が消えたい…
      <|2|>きみ… 大丈夫⁉
      <|3|>なんだこいつ 空気読めて ないのか…？
    - <|1|>好尴尬…我不想引人注目…我想消失…
      <|2|>你…没事吧⁉
      <|3|>这家伙怎么看不懂气氛的…？

# Overwrite configs for a specific model.
# For now the list is: gpt3, gpt35, gpt4
gpt35:
  temperature: 0.3
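
To make the per-model override concrete, here is a small Python sketch of how a section such as `gpt35` could be layered over the top-level settings, and how `{to_lang}` could be filled in. This is an assumption about the behaviour, not the project's actual loading code. `resolve` and `MODEL_KEYS` are hypothetical names, and the config is written as a plain dict (with a shortened prompt) instead of being parsed from YAML.

```python
# A cut-down version of the config above, written as a plain dict.
CONFIG = {
    "prompt_template": "Please help me to translate the following text from a manga to {to_lang}",
    "temperature": 0.5,
    "top_p": 1,
    "gpt35": {"temperature": 0.3},
}

# Section names that hold model-specific overrides (from the comment in the config).
MODEL_KEYS = ("gpt3", "gpt35", "gpt4")


def resolve(config, model):
    """Return the top-level settings with the model-specific section layered on top."""
    resolved = {key: value for key, value in config.items() if key not in MODEL_KEYS}
    resolved.update(config.get(model, {}))
    return resolved


settings = resolve(CONFIG, "gpt35")
prompt = settings["prompt_template"].format(to_lang="English")
print(settings["temperature"])
print(prompt)
```

With this layering, `gpt35` runs at the lower temperature while `gpt3` and `gpt4` keep the shared default.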

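Using Gimp for rendering

When the output format is set to {xcf,psd,pdf}, Gimp is used to generate the file.

On Windows this assumes Gimp 2.x is installed to C:\Users\<Username>\AppData\Local\Programs\Gimp 2.

The resulting .xcf file contains the original image as the lowest layer and the inpainting as a separate layer. Each translated text box has its own layer, named after the original text for easy access.

Limitations:

Gimp turns text layers into regular images when saving .psd files.
Rotated text is not handled well in Gimp. When editing a rotated text box, Gimp also shows a popup saying that it was modified by an outside program.
The font family is controlled separately, with the --gimp-font argument.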

API Documentation

API V2

# use `--mode api` to start a web server.
$ python -m manga_translator -v --mode api --use-gpu
# the api will be serving on http://127.0.0.1:5003

The API accepts JSON (POST) and multipart requests.
The API endpoints are /colorize_translate, /inpaint_translate, /translate and /get_text.
Valid arguments for the API are:

// These are taken from args.py. For more info see README.md
detector: String
ocr: String
inpainter: String
upscaler: String
translator: String 
target_language: String
upscale_ratio: Integer
translator_chain: String
selective_translation: String
attempts: Integer
detection_size: Integer // 1024 => 'S', 1536 => 'M', 2048 => 'L', 2560 => 'X'
text_threshold: Float
box_threshold: Float
unclip_ratio: Float
inpainting_size: Integer
det_rotate: Bool
det_auto_rotate: Bool
det_invert: Bool
det_gamma_correct: Bool
min_text_length: Integer
colorization_size: Integer
denoise_sigma: Integer
mask_dilation_offset: Integer
ignore_bubble: Integer
gpt_config: String
filter_text: String
overlay_type: String

// These are api specific args
direction: String // {'auto', 'h', 'v'}
base64Images: String //Image in base64 format
image: Multipart // image upload from multipart
url: String // an url string
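
As a rough illustration of the JSON variant, the sketch below builds a POST request for the /translate endpoint using only the argument names listed above. Whether the server expects bare base64 or a data-URI prefix, and what the response looks like, are not stated here. The bare base64 string is an assumption, and `build_translate_request` and `send` are hypothetical helper names.

```python
import base64
import json
import urllib.request

API_BASE = "http://127.0.0.1:5003"  # default address from the section above


def build_translate_request(image_bytes, target_language="ENG", translator="google", **extra):
    """Build (but do not send) a JSON POST request for the /translate endpoint."""
    payload = {
        # Assumption: the image is passed as a bare base64 string.
        "base64Images": base64.b64encode(image_bytes).decode("ascii"),
        "target_language": target_language,
        "translator": translator,
    }
    payload.update(extra)  # any other listed argument, e.g. detector="ctd"
    return urllib.request.Request(
        f"{API_BASE}/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the raw response body (its shape is not documented here)."""
    with urllib.request.urlopen(request) as response:
        return response.read()


request = build_translate_request(b"fake image bytes", detector="ctd", detection_size=1536)
print(request.full_url, request.get_method())
```

Keeping request construction separate from sending makes the payload easy to inspect before a server is running.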

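Manual translation replaces machine translation with human translators. When using web mode, a basic manual translation demo can be found at http://127.0.0.1:5003/manual.

API

The demo provides two modes of translation service: synchronous and asynchronous.
In synchronous mode, the HTTP POST request finishes once the translation task is finished.
In asynchronous mode, the HTTP POST request responds immediately with a task_id, which you can use to poll the state of the translation task.

Synchronous mode

POST a form request with form data file:<content-of-image> to http://127.0.0.1:5003/run
Wait for the response
Use the resulting task_id to find the translation result in the result/ directory, e.g. by using Nginx to expose result/

Asynchronous mode

POST a form request with form data file:<content-of-image> to http://127.0.0.1:5003/submit
Acquire the translation task_id
Poll the translation task state by posting JSON {"taskid": <task-id>} to http://127.0.0.1:5003/task-state
The translation is finished when the resulting state is either finished, error or error-lang
Find the translation result in the result/ directory, e.g. by using Nginx to expose result/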
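
The polling step of asynchronous mode can be sketched as a small loop. The HTTP call is left to a caller-supplied `post_json(path, body)` function so that any HTTP client can be plugged in. Reading the state from a `"state"` key in the reply is an assumption, since the response format is not documented here, and `wait_for_task` is a hypothetical helper name.

```python
import time

# Terminal states named in the steps above.
TERMINAL_STATES = {"finished", "error", "error-lang"}


def wait_for_task(task_id, post_json, interval=1.0, max_polls=600, sleep=time.sleep):
    """Poll /task-state until the task reaches a terminal state and return that state.

    post_json(path, body) must POST the JSON body to the server and return the
    decoded JSON reply. The "state" key is an assumption about that reply.
    """
    for _ in range(max_polls):
        reply = post_json("/task-state", {"taskid": task_id})
        state = reply.get("state")
        if state in TERMINAL_STATES:
            return state
        sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")


# Demonstration with a stand-in transport that reports "pending" once, then "finished".
_replies = iter([{"state": "pending"}, {"state": "finished"}])
print(wait_for_task("example-task", lambda path, body: next(_replies), interval=0))
```

Capping the number of polls keeps a stuck task from blocking the caller forever.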

Manual translation

POST a form request with form data file:<content-of-image> to http://127.0.0.1:5003/manual-translate and wait for the response.

You will receive a JSON response like this:

{
  "task_id": "12c779c9431f954971cae720eb104499",
  "status": "pending",
  "trans_result": [
    {
      "s": "☆上司来ちゃった……",
      "t": ""
    }
  ]
}

Fill in the translated texts:

{
  "task_id": "12c779c9431f954971cae720eb104499",
  "status": "pending",
  "trans_result": [
    {
      "s": "☆上司来ちゃった……",
      "t": "☆Boss is here..."
    }
  ]
}

POST the translated JSON to http://127.0.0.1:5003/post-manual-result and wait for the response.
You can then find the translation result in the result/ directory, e.g. by using Nginx to expose result/.
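
The fill-in step can be done programmatically. This sketch takes the JSON structure shown above and sets each entry's "t" field from a lookup table of human translations. `fill_translations` is a hypothetical helper, and posting the result back is left out.

```python
import copy
import json


def fill_translations(task, translations):
    """Return a copy of the task with "t" set wherever the source text "s" has a translation."""
    filled = copy.deepcopy(task)
    for entry in filled["trans_result"]:
        entry["t"] = translations.get(entry["s"], entry["t"])
    return filled


task = {
    "task_id": "12c779c9431f954971cae720eb104499",
    "status": "pending",
    "trans_result": [{"s": "☆上司来ちゃった……", "t": ""}],
}
body = fill_translations(task, {"☆上司来ちゃった……": "☆Boss is here..."})
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Working on a deep copy leaves the server's original response untouched, which is handy if a translation needs to be redone.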

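Next steps

Here is a list of what needs to be done next. You are welcome to contribute.

Use diffusion-model-based inpainting to achieve near-perfect results, although this could be much slower.
IMPORTANT!!! HELP NEEDED!!! The current text rendering engine is barely usable. We need your help to improve text rendering!
The text rendering area is determined by the detected text lines, not by speech bubbles.
This works for images without speech bubbles, but makes it impossible to decide where to put the translated English text. I have no idea how to solve this.
Ryota et al. proposed using multimodal machine translation; maybe we can add ViT features for building custom NMT models.
Make this project work for video (rewrite the code in C++ and use GPU/other hardware NN accelerators).
This would be used to detect hard subtitles in videos, generate an ASS subtitle file and remove the hard subtitles completely.
~~Mask refinement using non-deep-learning algorithms; I am currently testing a CRF-based algorithm.~~
~~Merging of angled text regions is not currently supported.~~
Create a pip repository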