[AI] (5) Print out the Ollama request information

Little scum · Posted on 2/6/2025 9:48:36 PM

Requirement: I deployed the DeepSeek-R1 model with Ollama and want to see the requests that clients and plugins send to it, in order to understand the details. Examples include Open WebUI, Continue, Cline, and Roo Code.

Review:

[AI] (3) Tutorial: deploying DeepSeek-R1 on Tencent Cloud with HAI
https://www.itsvse.com/thread-10931-1-1.html

[AI] (4) Use Open WebUI to call the DeepSeek-R1 model
https://www.itsvse.com/thread-10934-1-1.html

To print incoming requests on the server side, you need to enable debug mode. Edit the /etc/systemd/system/ollama.service.d/override.conf file and add the following configuration:

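A minimal sketch of such a drop-in, assuming Ollama's OLLAMA_DEBUG environment variable is the switch for debug logging. The helper name write_ollama_debug_override is hypothetical, and the target directory is a parameter so the function can be tried against a scratch directory before touching /etc.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): write a systemd drop-in
# that sets OLLAMA_DEBUG=1 for the ollama service.
# $1: drop-in directory; defaults to the path named in the text above.
# Note: this overwrites any existing override.conf in that directory.
write_ollama_debug_override() {
    dropin_dir="${1:-/etc/systemd/system/ollama.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Assumption: OLLAMA_DEBUG=1 turns on DEBUG-level logging in Ollama.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_DEBUG=1"
EOF
}
```

Called without an argument and with root privileges, the function writes to the real drop-in directory. If override.conf already holds other settings, adding the Environment line by hand is the safer route.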

Reload and start the ollama service with the following command:

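One way this step might look, assuming the unit is named ollama, sudo is available, and the service is already running (so restart is used rather than start). The function name reload_and_restart_ollama is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: apply unit-file changes and restart the service.
# Assumes the unit is named "ollama" and that sudo is available.
reload_and_restart_ollama() {
    # Make systemd re-read unit files and drop-ins.
    sudo systemctl daemon-reload || return 1
    # Restart so the new environment reaches the running process.
    sudo systemctl restart ollama
}
```

Without the daemon-reload step, systemd keeps using the unit definition it loaded earlier, so the new environment variable would not reach the restarted process.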

Use journalctl to view the service output logs with the following command:

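A small sketch of the journalctl call, again assuming the unit is named ollama. The wrapper name show_ollama_logs is hypothetical; it only passes extra options through to journalctl.

```shell
#!/bin/sh
# Hypothetical helper: show the journal of the ollama unit.
# Extra arguments are passed through to journalctl, for example
# -f to follow new entries or -n 100 to show only the last 100 lines.
show_ollama_logs() {
    journalctl -u ollama --no-pager "$@"
}
```

For example, show_ollama_logs -f keeps the terminal attached and prints each new log line as requests arrive.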

Use Open WebUI to call Ollama as a test.

The logs are as follows:

Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:25:48 | 200 |  6.186257471s |    172.18.0.2 | POST    "/api/chat"
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.834+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.835+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<｜User｜>My name is Xiao Zha, who are you? <｜Assistant｜>"
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.836+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=728 prompt=13 used=2 remaining=11
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:02 | 200 |  7.642182053s |    172.18.0.2 | POST    "/api/chat"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.491+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.491+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<｜User｜>### Task:\nGenerate a concise, 3-5 word title with an emoji summarizing the chat history.\n### Guidelines:\n- The title should clearly represent the main theme or subject of the conversation.\n- Use emojis that enhance understanding of the topic, but avoid quotation marks or special formatting.\n- Write the title in the chat's primary language; default to English if multilingual.\n- Prioritize accuracy over excessive creativity; keep it clear and simple.\n### Output:\nJSON format: { \"title\": \"your concise title here\" }\n### Examples:\n- { \"title\": \" Stock Market Trends\" },\n- { \"title\": \" Perfect Chocolate Chip Recipe\" },\n- { \"title\": \"Evolution of Music Streaming\" },\n- { \"title\": \"Remote Work Productivity Tips\" },\n- { \"title\": \"Artificial Intelligence in Healthcare\" },\n- { \"title\": \" Video Game Development Insights\" }\n### Chat History:\n<chat_history>\nUSER: My name is Xiao Zha, who are you? \nASSISTANT: Hello, little scumbag! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? \n</chat_history><｜Assistant｜>"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.495+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=1 cache=567 prompt=312 used=6 remaining=306
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:14 | 200 | 12.263297485s |    172.18.0.2 | POST    "/api/chat"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.769+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.769+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<｜User｜>### Task:\nGenerate 1-3 broad tags categorizing the main themes of the chat history, along with 1-3 more specific subtopic tags.\n\n### Guidelines:\n- Start with high-level domains (e.g. Science, Technology, Philosophy, Arts, Politics, Business, Health, Sports, Entertainment, Education)\n- Consider including relevant subfields/subdomains if they are strongly represented throughout the conversation\n- If content is too short (less than 3 messages) or too diverse, use only [\"General\"]\n- Use the chat's primary language; default to English if multilingual\n- Prioritize accuracy over specificity\n\n### Output:\nJSON format: { \"tags\": [\"tag1\", \"tag2\", \"tag3\"] }\n\n### Chat History:\n<chat_history>\nUSER: My name is Xiao Zha, who are you? \nASSISTANT: Hello, little scumbag! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? \n</chat_history><｜Assistant｜>"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.773+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=1 cache=637 prompt=249 used=7 remaining=242
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:17.717+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:17.718+08:00 level=DEBUG source=server.go:966 msg="new runner detected, loading model for cgo tokenization"
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from /data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest))
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 0:                      general.architecture str             = qwen2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 1:                            general.type str             = model
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 2:                            general.name str             = DeepSeek R1 Distill Qwen 32B
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 3:                         general.basename str             = DeepSeek-R1-Distill-Qwen
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 4:                      general.size_label str             = 32B
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 5:                         qwen2.block_count u32             = 64
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 6:                      qwen2.context_length u32             = 131072
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 7:                   qwen2.embedding_length u32             = 5120
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 8:                qwen2.feed_forward_length u32             = 27648
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv 9:                qwen2.attention.head_count u32             = 40
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  10:             qwen2.attention.head_count_kv u32             = 8
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  11:                      qwen2.rope.freq_base f32             = 1000000.000000
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  12:    qwen2.attention.layer_norm_rms_epsilon f32             = 0.000010
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  13:                         general.file_type u32             = 15
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  14:                      tokenizer.ggml.model str             = gpt2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  15:                      tokenizer.ggml.pre str             = deepseek-r1-qwen
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  16:                   tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  17:                tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  18:                   tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  19:             tokenizer.ggml.bos_token_id u32             = 151646
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  20:             tokenizer.ggml.eos_token_id u32             = 151643
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  21:          tokenizer.ggml.padding_token_id u32             = 151643
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  22:             tokenizer.ggml.add_bos_token bool          = true
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  23:             tokenizer.ggml.add_eos_token bool          = false
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  24:                   tokenizer.chat_template str             = {% if not add_generation_prompt is de...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  25:             general.quantization_version u32             = 2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type  f32:  321 tensors
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type q4_K:  385 tensors
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type q6_K: 65 tensors
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: missing or unrecognized pre-tokenizer type, using: 'default'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: special tokens cache size = 22
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: token to piece cache size = 0.9310 MB
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: format          = GGUF V3 (latest)
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: arch          = qwen2
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: vocab type    = BPE
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: n_vocab       = 152064
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: n_merges       = 151387
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: vocab_only    = 1
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model type    = ? B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model ftype    = all F32
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model params    = 32.76 B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model size    = 18.48 GiB (4.85 BPW)
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: general.name    = DeepSeek R1 Distill Qwen 32B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: BOS token       = 151646 '<｜begin▁of▁sentence｜>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOS token       = 151643 '<｜end▁of▁sentence｜>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOT token       = 151643 '<｜end▁of▁sentence｜>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: PAD token       = 151643 '<｜end▁of▁sentence｜>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: LF token       = 148848 'ÄĬ'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM PRE token = 151659 '<|fim_prefix|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM SUF token = 151661 '<|fim_suffix|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM MID token = 151660 '<|fim_middle|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM PAD token = 151662 '<|fim_pad|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM REP token = 151663 '<|repo_name|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM SEP token = 151664 '<|file_sep|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token       = 151643 '<｜end▁of▁sentence｜>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token       = 151662 '<|fim_pad|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token       = 151663 '<|repo_name|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token       = 151664 '<|file_sep|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: max token length = 256
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llama_model_load: vocab only - skipping tensors
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:18.440+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<｜User｜>My name is Xiao Zha, who are you? <｜Assistant｜>\nHello, little scumbag! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? <｜end▁of▁sentence｜><｜User｜>Hello DeepSeek-R1<｜Assistant｜>"
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:18.491+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=223 prompt=64 used=13 remaining=51
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:24 | 200 |  6.737131375s |    172.18.0.2 | POST    "/api/chat"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.426+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.426+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=1
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:24 | 200 | 10.172441322s |    172.18.0.2 | POST    "/api/chat"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
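Most of the debug output is scheduler and model-loader noise; the lines of interest are the ones with msg="chat request". A rough filter for them might look like this, assuming the log layout shown above (the function name extract_chat_prompts is hypothetical):

```shell
#!/bin/sh
# Hypothetical filter: read journal lines on stdin and print only the
# prompt text of each "chat request" debug entry.
# Assumes the field order seen in the logs above:
#   msg="chat request" images=N prompt=...
extract_chat_prompts() {
    grep 'msg="chat request"' |
        # Drop everything up to and including "prompt=", then strip the
        # optional surrounding double quotes. Escapes such as \n and \"
        # inside the prompt are left exactly as logged.
        sed -e 's/^.*msg="chat request" images=[0-9]* prompt=//' \
            -e 's/^"//' -e 's/"$//'
}
```

It can be fed from the journal, for example: journalctl -u ollama --no-pager | extract_chat_prompts. In the session above this would print four prompts: the user's first message, the title-generation task, the tag-generation task, and the second user message with the chat history prepended.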

Little scum · Posted on 2/6/2025 9:53:55 PM

You can also edit the ollama service file using the following command:

When you run this command, a text editor (usually vi or nano) opens and lets you edit the /etc/systemd/system/ollama.service file.
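A plausible form of that command is systemctl edit; this is an assumption, as is the unit name ollama.service. Which file the editor opens depends on how it is invoked: plain systemctl edit creates or edits a drop-in (override.conf under /etc/systemd/system/ollama.service.d/), while systemctl edit --full edits a full copy of the unit file. The wrapper name edit_ollama_unit is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: open the ollama unit in the default editor.
# With no argument, systemctl edit creates or edits a drop-in
# (override.conf under /etc/systemd/system/ollama.service.d/).
# With --full as the argument, it edits a full copy of the unit file.
edit_ollama_unit() {
    sudo systemctl edit ${1:+"$1"} ollama.service
}
```

After the editor is closed, systemctl edit reloads the unit configuration itself, but the service still has to be restarted for the change to take effect.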

Little scum · Posted on 2/7/2025 9:08:25 AM

Linux: viewing the log output of a systemd service
https://www.itsvse.com/thread-10154-1-1.html
