Issue with RTX A6000 execution #55
Comments
Same issue: 2025-02-28 14:11:17,944 - main - WARNING - Attempt 66: All connection attempts failed |
I have the same issue, any solution? |
Same issue here. Tested with an RTX 3090 24GB. |
SGLang needs to download the model weights and set them up; that's what's happening in the background. The warnings are from olmocr waiting for that setup to finish. Presumably if you wait long enough it will connect. |
Yeah, please wait longer for the download and initialization of the weights from Hugging Face and for sglang to start up. It can take 2-3 minutes on a cold start. |
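To see whether the sglang server has actually come up (rather than counting warning lines), you can poll its TCP port until it accepts connections. This is a minimal sketch, not part of olmocr itself; port 30024 is taken from the `ServerArgs` line in the log below, and the timeout is an arbitrary choice:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 180.0, interval: float = 1.0) -> bool:
    """Poll until a TCP port accepts connections or the timeout expires.

    Returns True as soon as a connection succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while the server is still starting
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

if __name__ == "__main__":
    # 127.0.0.1:30024 is the default host/port shown in the sglang ServerArgs log
    if wait_for_port("127.0.0.1", 30024, timeout=300.0):
        print("sglang server is accepting connections")
    else:
        print("sglang server did not come up within the timeout")
```

If this never returns True even after several minutes, the problem is likely the server process itself (e.g. a crash during weight download) rather than olmocr's retry loop.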
After waiting, I got this error. Hardware: RTX 4090 24GB. I just followed the installation setup and ran using -
Error stack:
|
Random question: can you run this Python code to "predownload" the model into your Hugging Face cache, and then restart?

```python
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Initialize the model (this downloads the weights into the Hugging Face cache)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "allenai/olmOCR-7B-0225-preview", torch_dtype=torch.bfloat16
).eval()
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")

# Move the model to GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
``` |
🐛 Describe the bug
I am using an NVIDIA RTX A6000 48GB.
Followed the instructions carefully, all seemed to install and be fine.
CUDA_DEVICE_ORDER=PCI_BUS_ID python -m olmocr.pipeline ./localworkspace --pdfs /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf
Any ideas? Something to do with sglang not installing? It seems to be present in the conda env, but it is not running properly. It gives the following error:
(olmocr) pop@pop-os:~/Documents/olmocr$ CUDA_DEVICE_ORDER=PCI_BUS_ID python -m olmocr.pipeline ./localworkspace --pdfs /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf
INFO:olmocr.check:pdftoppm is installed and working.
2025-02-28 13:00:42,979 - main - INFO - Got --pdfs argument, going to add to the work queue
2025-02-28 13:00:42,979 - main - INFO - Loading file at /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf as PDF document
2025-02-28 13:00:42,979 - main - INFO - Found 1 total pdf paths to add
Sampling PDFs to calculate optimal length: 100%|███████████████| 1/1 [00:00<00:00, 178.34it/s]
2025-02-28 13:00:42,985 - main - INFO - Calculated items_per_group: 33 based on average pages per PDF: 15.00
INFO:olmocr.work_queue:Found 1 total paths
INFO:olmocr.work_queue:0 new paths to add to the workspace
2025-02-28 13:00:43,106 - main - INFO - Starting pipeline with PID 66979
INFO:olmocr.work_queue:Initialized local queue with 1 work items
2025-02-28 13:00:43,168 - main - WARNING - Attempt 1: All connection attempts failed
2025-02-28 13:00:44,193 - main - WARNING - Attempt 2: All connection attempts failed
2025-02-28 13:00:45,228 - main - WARNING - Attempt 3: All connection attempts failed
2025-02-28 13:00:46,273 - main - WARNING - Attempt 4: All connection attempts failed
2025-02-28 13:00:47,299 - main - WARNING - Attempt 5: All connection attempts failed
2025-02-28 13:00:48,346 - main - WARNING - Attempt 6: All connection attempts failed
2025-02-28 13:00:48,469 - main - INFO - [2025-02-28 13:00:48] server_args=ServerArgs(model_path='allenai/olmOCR-7B-0225-preview', tokenizer_path='allenai/olmOCR-7B-0225-preview', tokenizer_mode='auto', load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, quantization=None, context_length=None, device='cuda', served_model_name='allenai/olmOCR-7B-0225-preview', chat_template='qwen2-vl', is_embedding=False, revision=None, skip_tokenizer_init=False, host='127.0.0.1', port=30024, mem_fraction_static=0.8, max_running_requests=None, max_total_tokens=None, chunked_prefill_size=2048, max_prefill_tokens=16384, schedule_policy='lpm', schedule_conservativeness=1.0, cpu_offload_gb=0, prefill_only_one_req=False, tp_size=1, stream_interval=1, stream_output=False, random_seed=136363370, constrained_json_whitespace_pattern=None, watchdog_timeout=300, download_dir=None, base_gpu_id=0, log_level='info', log_level_http='warning', log_requests=False, show_time_cost=False, enable_metrics=False, decode_log_interval=40, api_key=None, file_storage_pth='sglang_storage', enable_cache_report=False, dp_size=1, load_balance_method='round_robin', ep_size=1, dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', lora_paths=None, max_loras_per_batch=8, attention_backend='flashinfer', sampling_backend='flashinfer', grammar_backend='outlines', speculative_draft_model_path=None, speculative_algorithm=None, speculative_num_steps=5, speculative_num_draft_tokens=64, speculative_eagle_topk=8, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, disable_jump_forward=False, disable_cuda_graph=False, disable_cuda_graph_padding=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, disable_mla=False, disable_overlap_schedule=False, enable_mixed_chunk=False, 
enable_dp_attention=False, enable_ep_moe=False, enable_torch_compile=False, torch_compile_max_bs=32, cuda_graph_max_bs=8, cuda_graph_bs=None, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, tool_call_parser=None)
2025-02-28 13:00:49,387 - main - WARNING - Attempt 7: All connection attempts failed
2025-02-28 13:00:50,094 - main - INFO - Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
2025-02-28 13:00:50,544 - main - WARNING - Attempt 8: All connection attempts failed
2025-02-28 13:00:51,567 - main - WARNING - Attempt 9: All connection attempts failed
2025-02-28 13:00:51,900 - main - INFO - [2025-02-28 13:00:51] Use chat template for the OpenAI-compatible API server: qwen2-vl
2025-02-28 13:00:52,614 - main - WARNING - Attempt 10: All connection attempts failed
2025-02-28 13:00:53,637 - main - WARNING - Attempt 11: All connection attempts failed
2025-02-28 13:00:54,660 - main - WARNING - Attempt 12: All connection attempts failed
2025-02-28 13:00:55,683 - main - WARNING - Attempt 13: All connection attempts failed
Versions
Hope this is something obvious!