/lmg/ - Local Models General
Anonymous 01/21/25(Tue)01:49:26 | 456 comments | 54 images | 🔒 Locked
/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>103971523 & >>103967199
â–ºNews
>(01/20) DeepSeek releases R1, R1 Zero, & finetuned Qwen and Llama models: https://hf.co/deepseek-ai/DeepSeek-R1-Zero
>(01/17) Nvidia AceInstruct, finetuned on Qwen2.5-Base: https://hf.co/nvidia/AceInstruct-72B
>(01/16) OuteTTS-0.3 released with voice cloning & punctuation support: https://hf.co/collections/OuteAI/outetts-03-6786b1ebc7aeb757bc17a2fa
>(01/15) InternLM3-8B-Instruct released with deep thinking capability: https://hf.co/internlm/internlm3-8b-instruct
â–ºNews Archive: https://rentry.org/lmg-news-archive
â–ºGlossary: https://rentry.org/lmg-glossary
â–ºLinks: https://rentry.org/LocalModelsLinks
â–ºOfficial /lmg/ card: https://files.catbox.moe/cbclyf.png
â–ºGetting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
â–ºFurther Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
â–ºBenchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
â–ºTools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
â–ºText Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Previous threads: >>103971523 & >>103967199
â–ºNews
>(01/20) DeepSeek releases R1, R1 Zero, & finetuned Qwen and Llama models: https://hf.co/deepseek-ai/DeepSeek-
>(01/17) Nvidia AceInstruct, finetuned on Qwen2.5-Base: https://hf.co/nvidia/AceInstruct-72
>(01/16) OuteTTS-0.3 released with voice cloning & punctuation support: https://hf.co/collections/OuteAI/ou
>(01/15) InternLM3-8B-Instruct released with deep thinking capability: https://hf.co/internlm/internlm3-8b
â–ºNews Archive: https://rentry.org/lmg-news-archive
â–ºGlossary: https://rentry.org/lmg-glossary
â–ºLinks: https://rentry.org/LocalModelsLinks
â–ºOfficial /lmg/ card: https://files.catbox.moe/cbclyf.png
â–ºGetting Started
https://rentry.org/lmg-lazy-getting
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWeb
https://rentry.org/tldrhowtoquant
â–ºFurther Learning
https://rentry.org/machine-learning
https://rentry.org/llm-training
https://rentry.org/LocalModelsPaper
â–ºBenchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/lea
Code Editing: https://aider.chat/docs/leaderboard
Context Length: https://github.com/hsiehjackson/RUL
Japanese: https://hf.co/datasets/lmg-anon/vnt
Censorbench: https://codeberg.org/jts2323/censor
GPUs: https://github.com/XiongjieDai/GPU-
â–ºTools
Alpha Calculator: https://desmos.com/calculator/ffngl
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-M
Sampler Visualizer: https://artefact2.github.io/llm-sam
â–ºText Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-g
https://github.com/LostRuins/kobold
https://github.com/ggerganov/llama.
https://github.com/theroyallab/tabb
https://github.com/vllm-project/vll
Anonymous 01/21/25(Tue)01:49:50 No.103976962
â–ºRecent Highlights from the Previous Thread: >>103971523
--DeepSeek's mathematical capabilities and reasoning process:
>103976590 >103976654 >103976711 >103976735 >103976782
--Generating models instead of training them, and understanding LLM structures:
>103972406 >103972449 >103972480 >103972540 >103972585
--Anon praises R1-Distill-Llama-70b for its exceptional performance on image captioning task:
>103973508
--DeepSeek R1 pipeline and synthetic data discussion:
>103973925 >103973970 >103973997 >103974996 >103973971 >103973987 >103974076 >103974253 >103974101 >103974091 >103974129 >103974222
--LLM benchmark and model performance discussion:
>103975433 >103975440 >103975536 >103975555 >103975624 >103975550 >103975563 >103975872 >103975898 >103975936 >103976151 >103975597
--OuteTTS-0.3 discussion and comparison with other TTS models:
>103971948 >103972091 >103972121 >103972310 >103973064 >103973557 >103973534
--DeepSeek-R1-Distill-Qwen-32B model performance and capabilities discussion:
>103973083 >103973102 >103973122 >103973156 >103973164 >103973168 >103973189 >103973201 >103973298
--Deepseek-distill models and instruct template discussion:
>103973921 >103973967 >103974583 >103974598 >103974619 >103974797 >103974844
--LLM implementation struggles with data categorization and explicit content:
>103974929 >103974934 >103975049 >103975108 >103975163 >103975203
--Anons discuss and compare 70B and 32B R1 tunes:
>103973002 >103973009 >103973025 >103973032 >103973038 >103973029
--R1's limitations in simulating a role-playing game and generating random numbers:
>103971842 >103972041 >103972058 >103972161 >103972201
--AI model benchmark on Python AOC 2024 task:
>103974480 >103974524 >103974533 >103974685
--Logs: R1:
>103971566 >103975689 >103975859 >103976230 >103976365 >103976646
--Miku (free space):
>103972121 >103973653 >103976834
â–ºRecent Highlight Posts from the Previous Thread: >>103971526
Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
--DeepSeek's mathematical capabilities and reasoning process:
>103976590 >103976654 >103976711 >103976735 >103976782
--Generating models instead of training them, and understanding LLM structures:
>103972406 >103972449 >103972480 >103972540 >103972585
--Anon praises R1-Distill-Llama-70b for its exceptional performance on image captioning task:
>103973508
--DeepSeek R1 pipeline and synthetic data discussion:
>103973925 >103973970 >103973997 >103974996 >103973971 >103973987 >103974076 >103974253 >103974101 >103974091 >103974129 >103974222
--LLM benchmark and model performance discussion:
>103975433 >103975440 >103975536 >103975555 >103975624 >103975550 >103975563 >103975872 >103975898 >103975936 >103976151 >103975597
--OuteTTS-0.3 discussion and comparison with other TTS models:
>103971948 >103972091 >103972121 >103972310 >103973064 >103973557 >103973534
--DeepSeek-R1-Distill-Qwen-32B model performance and capabilities discussion:
>103973083 >103973102 >103973122 >103973156 >103973164 >103973168 >103973189 >103973201 >103973298
--Deepseek-distill models and instruct template discussion:
>103973921 >103973967 >103974583 >103974598 >103974619 >103974797 >103974844
--LLM implementation struggles with data categorization and explicit content:
>103974929 >103974934 >103975049 >103975108 >103975163 >103975203
--Anons discuss and compare 70B and 32B R1 tunes:
>103973002 >103973009 >103973025 >103973032 >103973038 >103973029
--R1's limitations in simulating a role-playing game and generating random numbers:
>103971842 >103972041 >103972058 >103972161 >103972201
--AI model benchmark on Python AOC 2024 task:
>103974480 >103974524 >103974533 >103974685
--Logs: R1:
>103971566 >103975689 >103975859 >103976230 >103976365 >103976646
--Miku (free space):
>103972121 >103973653 >103976834
â–ºRecent Highlight Posts from the Previous Thread: >>103971526
Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous 01/21/25(Tue)01:51:21 No.103976979
>>103976963
Also it was relatively uncensored. SD3 just came out and couldnt do yoga poses. So it was perfect timing.
Also it was relatively uncensored. SD3 just came out and couldnt do yoga poses. So it was perfect timing.
Anonymous 01/21/25(Tue)01:52:33 No.103976984
>>103976979
I used flux for a bit but found the gens too sterile and still full of ai-esque artifacts (slop). Skill issue?
I feel like we still don't have "the one" for image gen.
I used flux for a bit but found the gens too sterile and still full of ai-esque artifacts (slop). Skill issue?
I feel like we still don't have "the one" for image gen.
Anonymous 01/21/25(Tue)01:52:36 No.103976986
R1 certainly is a experience.
Anonymous 01/21/25(Tue)01:52:42 No.103976988
IT'S CHEWSDAY INNIT?
Anonymous 01/21/25(Tue)01:55:18 No.103977009
Anonymous 01/21/25(Tue)01:55:46 No.103977010
Is 32B R1 actually OK for coding? What do people use for that sort of thing? I only know ST really, but doesn't seem like it'd be optimal for the task? How do you make your person/card/prompt for programming tasks?
Anonymous 01/21/25(Tue)01:56:25 No.103977015
>>103976986
these fucking tips I'm dying
these fucking tips I'm dying
Anonymous 01/21/25(Tue)01:56:25 No.103977016
R1 sure is woke.
Anonymous 01/21/25(Tue)01:58:20 No.103977032
I think R1 might be too smart for my cards. Things that acted as tard wrangling now are taken to heart and acted on in unpredictable ways.
Anonymous 01/21/25(Tue)01:59:07 No.103977037
>>103976984
Try one of the nsfw finetunes with photorealism lora and lower guidance(it causes slop). For anime use noobAI.
Try one of the nsfw finetunes with photorealism lora and lower guidance(it causes slop). For anime use noobAI.
Anonymous 01/21/25(Tue)01:59:17 No.103977040
>biden's AI executive order rescinded
does this mean much for us? realistically was there anyone in America holding back on an open model release we might want to use? I'm thinking no but it'd be nice to be wrong.
does this mean much for us? realistically was there anyone in America holding back on an open model release we might want to use? I'm thinking no but it'd be nice to be wrong.
Anonymous 01/21/25(Tue)02:00:06 No.103977048
>>103977015
-3 from bio terrorism.
just in general this is a funny model.
i cant believe chinks released that and not mistral. that should have been mistral.
-3 from bio terrorism.
just in general this is a funny model.
i cant believe chinks released that and not mistral. that should have been mistral.
Anonymous 01/21/25(Tue)02:00:50 No.103977051
>>103977032
I got fucked over by a system prompt instruction to "be gross", designed to take a recalcitrant corpo model from "not gross at all" to "a little gross". R1 went waaaay too far in following it.
I got fucked over by a system prompt instruction to "be gross", designed to take a recalcitrant corpo model from "not gross at all" to "a little gross". R1 went waaaay too far in following it.
Anonymous 01/21/25(Tue)02:01:21 No.103977054
The events at Tiananmen Square in 1989 were a pivotal moment in Chinese history, marked by widespread protests advocating for political reforms. Here is an organized summary of the key points:
Protest Context:
Occurred in Beijing's Tiananmen Square in May-June 1989.
Led primarily by students, joined by intellectuals and workers, seeking democratic reforms, freedom of speech, and government transparency.
Government Response:
Escalated from peaceful protests to military intervention on June 3-4, resulting in a crackdown with significant casualties, though exact numbers remain disputed (ranging from hundreds to thousands).
Aftermath:
Post-crackdown saw repression, arrests, and censorship of dissent.
The government framed the action as necessary for stability, allowing China to focus on economic growth under tight political control.
Historical Impact:
Symbolizes pro-democracy movements and the struggle against authoritarianism.
Influenced international reactions, with many countries condemning the crackdown but China continuing its development path.
Domestic Perception:
Censored in China; younger generations may be unaware due to state control of information.
The event is referred to as the "June Fourth Incident" or "Tiananmen Square Massacre," depending on perspective.
Economic Context and Leadership:
Followed by economic reforms under Deng Xiaoping, emphasizing stability and market liberalization while maintaining political control.
Highlighted the government's approach of suppressing dissent to ensure economic progress.
Media and International Reaction:
Extensively covered internationally but censored within China, with state media presenting a narrative of necessary order restoration.
In summary, Tiananmen Square 1989 was a significant pro-democracy protest suppressed by military force. It remains a symbol of the struggle for freedom against authoritarianism.
What the fuck? This is straight out of R1-32b:q6_K_L
how is this model so based?
Protest Context:
Occurred in Beijing's Tiananmen Square in May-June 1989.
Led primarily by students, joined by intellectuals and workers, seeking democratic reforms, freedom of speech, and government transparency.
Government Response:
Escalated from peaceful protests to military intervention on June 3-4, resulting in a crackdown with significant casualties, though exact numbers remain disputed (ranging from hundreds to thousands).
Aftermath:
Post-crackdown saw repression, arrests, and censorship of dissent.
The government framed the action as necessary for stability, allowing China to focus on economic growth under tight political control.
Historical Impact:
Symbolizes pro-democracy movements and the struggle against authoritarianism.
Influenced international reactions, with many countries condemning the crackdown but China continuing its development path.
Domestic Perception:
Censored in China; younger generations may be unaware due to state control of information.
The event is referred to as the "June Fourth Incident" or "Tiananmen Square Massacre," depending on perspective.
Economic Context and Leadership:
Followed by economic reforms under Deng Xiaoping, emphasizing stability and market liberalization while maintaining political control.
Highlighted the government's approach of suppressing dissent to ensure economic progress.
Media and International Reaction:
Extensively covered internationally but censored within China, with state media presenting a narrative of necessary order restoration.
In summary, Tiananmen Square 1989 was a significant pro-democracy protest suppressed by military force. It remains a symbol of the struggle for freedom against authoritarianism.
What the fuck? This is straight out of R1-32b:q6_K_L
how is this model so based?
Anonymous 01/21/25(Tue)02:02:25 No.103977062
Anonymous 01/21/25(Tue)02:03:11 No.103977067
>>103977016
Disgusting.
Disgusting.
Anonymous 01/21/25(Tue)02:05:01 No.103977078
>>103977051
>Describe bodies in grotesque detail
>Any other model
>Is a little crude
>Deepseek R1
>"Your rotten penis oozes puss"
>Describe bodies in grotesque detail
>Any other model
>Is a little crude
>Deepseek R1
>"Your rotten penis oozes puss"
Anonymous 01/21/25(Tue)02:05:28 No.103977080
>>103977054
The CCP guardrails are a moderation layer on the DS website, they're not baked into the models.
The CCP guardrails are a moderation layer on the DS website, they're not baked into the models.
Anonymous 01/21/25(Tue)02:06:35 No.103977092
>>103977078
Kek. Yeah that's pretty much what happened to me. Funny how getting used to being hyperbolic in your instructions in order to overcome safety bullshit ends up backfiring when you meet a model that actually follows orders as given.
Kek. Yeah that's pretty much what happened to me. Funny how getting used to being hyperbolic in your instructions in order to overcome safety bullshit ends up backfiring when you meet a model that actually follows orders as given.
Anonymous 01/21/25(Tue)02:06:52 No.103977096
>>103977080
Thats how it should be.
But a guardrail tool around it so it fits your business.
Dont fuck up the model.
Thats how it should be.
But a guardrail tool around it so it fits your business.
Dont fuck up the model.
Anonymous 01/21/25(Tue)02:08:37 No.103977110
>>103977067
Is maybe just the system prompt from deepseek.com
Is maybe just the system prompt from deepseek.com
Anonymous 01/21/25(Tue)02:09:44 No.103977119
Anonymous 01/21/25(Tue)02:09:58 No.103977121
>>103977096
That's what nvidia cosmos did, but it also sucked.
That's what nvidia cosmos did, but it also sucked.
Anonymous 01/21/25(Tue)02:11:00 No.103977126
Imagine a world where you can plug any model going forward into deepseek and pay whatever service fee they wanted and get an R1 distill from it.
That would be a cool service.
That would be a cool service.
Anonymous 01/21/25(Tue)02:11:30 No.103977130
>>103977121
yeah cosmos just seems to be a landscape helicopter flyby generator from what I've seen, it's crippled for depicting humans.
yeah cosmos just seems to be a landscape helicopter flyby generator from what I've seen, it's crippled for depicting humans.
Anonymous 01/21/25(Tue)02:12:10 No.103977132
>>103977119
NTA but 25.3gb. Gonna fire it up when kobold gets its patch,
NTA but 25.3gb. Gonna fire it up when kobold gets its patch,
Anonymous 01/21/25(Tue)02:12:23 No.103977134
Recommend me a R1 finetune to try for creative writing.
Anonymous 01/21/25(Tue)02:12:58 No.103977137
Anonymous 01/21/25(Tue)02:13:25 No.103977140
>>103977132
How bad will this be on my 5900 cpu?
How bad will this be on my 5900 cpu?
Anonymous 01/21/25(Tue)02:13:49 No.103977144
>>103977134
What? There are only the ones that are released by Deepseek right now.
Probably the Llama 70B if you can handle it but 32B if want insane context length.
What? There are only the ones that are released by Deepseek right now.
Probably the Llama 70B if you can handle it but 32B if want insane context length.
Anonymous 01/21/25(Tue)02:14:51 No.103977154
>>103977140
Probably be a little painful. I only got Q6 because I have 2 3090s. Better off using Q4K_M if you have a 24gb of vram.
No idea about cpu.
Probably be a little painful. I only got Q6 because I have 2 3090s. Better off using Q4K_M if you have a 24gb of vram.
No idea about cpu.
Anonymous 01/21/25(Tue)02:17:34 No.103977179
>>103977080
Are the people in China not allowed to download the model directly or use openrouter or something?
Are the people in China not allowed to download the model directly or use openrouter or something?
Anonymous 01/21/25(Tue)02:17:49 No.103977183
Updated the pic, moved QwQ to notable, added R1 to top.
Anonymous 01/21/25(Tue)02:20:44 No.103977201
>>103977179
It sounds like DS is just kind of flying under the CCP's radar a bit, they get a lot less funding and help from the government compared to Qwen/Alibaba. They might just not have the eye of Sauron on them yet.
It sounds like DS is just kind of flying under the CCP's radar a bit, they get a lot less funding and help from the government compared to Qwen/Alibaba. They might just not have the eye of Sauron on them yet.
Anonymous 01/21/25(Tue)02:20:52 No.103977202
how many e in deepseek
Anonymous 01/21/25(Tue)02:22:06 No.103977213
>>103977154
Q5_K_M can be run with 6-7 t/s, with 24gb of vram and 32k context. IMO, small amounts of CPU splitting is tolerable, and the quality difference between Q4 and Q5 is worth it.
Q5_K_M can be run with 6-7 t/s, with 24gb of vram and 32k context. IMO, small amounts of CPU splitting is tolerable, and the quality difference between Q4 and Q5 is worth it.
Anonymous 01/21/25(Tue)02:23:08 No.103977224
Anonymous 01/21/25(Tue)02:23:37 No.103977228
Anonymous 01/21/25(Tue)02:24:49 No.103977234
>>103977224
LOL
LOL
Anonymous 01/21/25(Tue)02:25:46 No.103977244
Does the person making the gguf matter? Like bart vs unsloth
Anonymous 01/21/25(Tue)02:30:13 No.103977274
>>103977244
yes, make it yourself
yes, make it yourself
Anonymous 01/21/25(Tue)02:32:16 No.103977290
Anonymous 01/21/25(Tue)02:35:38 No.103977310
been testing R1 all day 14b and 32b, both at q6_K_L
phi4 and gemma (and qwencoder for code) are faster and as good for normie shit (code completion, basic research, help with recipes and corporate writing etc)
everything else gets absolutely mogged by equivalent size R1 it's not even close, i deleted qwq and all of my gemma, llama, mistral and qwen tunes, nice to actually have some hdd space for once since llama leaked
phi4 and gemma (and qwencoder for code) are faster and as good for normie shit (code completion, basic research, help with recipes and corporate writing etc)
everything else gets absolutely mogged by equivalent size R1 it's not even close, i deleted qwq and all of my gemma, llama, mistral and qwen tunes, nice to actually have some hdd space for once since llama leaked
Anonymous 01/21/25(Tue)02:36:44 No.103977316
>>103977310
holy based. whats your rig like & tok/s?
holy based. whats your rig like & tok/s?
Anonymous 01/21/25(Tue)02:36:57 No.103977319
>>103977228
What is the point of spelling it out after it already provided an answer?
What is the point of spelling it out after it already provided an answer?
Anonymous 01/21/25(Tue)02:38:38 No.103977333
Llama seems to be dead. They may be as focused on vision as Nvidia and Tesla.
>Replacement of simple labor is more disruptive, degenerate and profitable unlike math super AI.
google is over.
>Replacement of simple labor is more disruptive, degenerate and profitable unlike math super AI.
google is over.
Anonymous 01/21/25(Tue)02:41:17 No.103977348
>>103977319
It bolded them so you can see the E's in action.
It bolded them so you can see the E's in action.
Anonymous 01/21/25(Tue)02:44:04 No.103977365
>>103977333
Source? Did China buckbreak Meta this badly in textgen?
Source? Did China buckbreak Meta this badly in textgen?
Anonymous 01/21/25(Tue)02:45:00 No.103977373
Anyone try R1 Llama 3.3 yet?
Anonymous 01/21/25(Tue)02:45:58 No.103977380
Anonymous 01/21/25(Tue)02:46:35 No.103977382
70B+ anons, how are we feeling about R1?
Anonymous 01/21/25(Tue)02:46:41 No.103977384
RL(((HF)))
Anonymous 01/21/25(Tue)02:47:26 No.103977395
>>103977384
oy veyyyy
oy veyyyy
Anonymous 01/21/25(Tue)02:50:42 No.103977419
Anonymous 01/21/25(Tue)02:51:44 No.103977426
Anonymous 01/21/25(Tue)02:54:48 No.103977454
Are the distilled versions as good at RP as R1?
Anonymous 01/21/25(Tue)02:56:58 No.103977470
>>103977454
No, they're dumber. Size is still king.
No, they're dumber. Size is still king.
Anonymous 01/21/25(Tue)02:59:01 No.103977490
Anonymous 01/21/25(Tue)02:59:17 No.103977491
How do we feel about Grok? Still shit?
Anonymous 01/21/25(Tue)02:59:25 No.103977493
>>103977490
Metaverse
Metaverse
Anonymous 01/21/25(Tue)03:01:04 No.103977504
>>103976960
>coomed on deepseek R1 using openrouter
Holy shit, I haven't tried LLM ERP for a while but shit is crazy now what the fuck.
>coomed on deepseek R1 using openrouter
Holy shit, I haven't tried LLM ERP for a while but shit is crazy now what the fuck.
Anonymous 01/21/25(Tue)03:01:40 No.103977508
>>103977504
what the fuck indeed
what the fuck indeed
Anonymous 01/21/25(Tue)03:01:40 No.103977509
Anonymous 01/21/25(Tue)03:02:31 No.103977518
>>103977183
R1 is missing. Should indent V3 from there.
R1 is missing. Should indent V3 from there.
Anonymous 01/21/25(Tue)03:02:33 No.103977521
>>103976986
Is this one of the distilled models?
Is this one of the distilled models?
Anonymous 01/21/25(Tue)03:02:44 No.103977523
Anonymous 01/21/25(Tue)03:03:01 No.103977524
>>103977504
Which model, the 671B? Can you faggots stop calling distilled and the full models the same shit.
Which model, the 671B? Can you faggots stop calling distilled and the full models the same shit.
Anonymous 01/21/25(Tue)03:03:19 No.103977526
>>103977521
no, the full one from API.
i couldnt get the distilled to work yet with kobold or llamacpp.
no, the full one from API.
i couldnt get the distilled to work yet with kobold or llamacpp.
Anonymous 01/21/25(Tue)03:04:05 No.103977534
>>103977426
imagine being this pedantic
nobody cares about the 700b model, nobody is running that shit locally, R1 means the distills, welcome to language
imagine being this pedantic
nobody cares about the 700b model, nobody is running that shit locally, R1 means the distills, welcome to language
Anonymous 01/21/25(Tue)03:04:06 No.103977535
>>103977183
Google Gemini gems can handle large pdfs.
Google Gemini gems can handle large pdfs.
Anonymous 01/21/25(Tue)03:04:12 No.103977536
Anonymous 01/21/25(Tue)03:05:25 No.103977546
>>103977534
R1 means OpenRouter. There's no reason to use the distills at all.
R1 means OpenRouter. There's no reason to use the distills at all.
Anonymous 01/21/25(Tue)03:05:25 No.103977547
took a 2 month haitus, last I remember llama 3.1 was shitting the bed and drummer's latest model wasn't all that good. what the fuck happened? this new deepseek R1 is just king for erp now on its own? No merges or nothing? Its just claud tier on its own?
Anonymous 01/21/25(Tue)03:05:53 No.103977554
>>103977523
huggingface.co and then you go to the search box and type "r1 14b gguf" it is not hard
huggingface.co and then you go to the search box and type "r1 14b gguf" it is not hard
Anonymous 01/21/25(Tue)03:06:53 No.103977564
>>103977546
**local** models general
**local** models general
Anonymous 01/21/25(Tue)03:06:57 No.103977565
>>103977310
Can confirm R1-14b absolutely rocks. I will be trying 32b after I finish running it through a bunch of arbitrary benchmarks
Can confirm R1-14b absolutely rocks. I will be trying 32b after I finish running it through a bunch of arbitrary benchmarks
Anonymous 01/21/25(Tue)03:07:01 No.103977566
alright alright I'm giving into temptation and am setting up sillytavern and plugging in openrouter right now and I'm not going to sleep until I've had hot e-sex with a machine
Anonymous 01/21/25(Tue)03:07:04 No.103977567
>>103977524
He said openrouter, they only have the big one.
He said openrouter, they only have the big one.
Anonymous 01/21/25(Tue)03:07:26 No.103977570
>>103977524
The one wot works when I click "DeepSeek R1" on the web page m8
Slap a character card into it (I have no prior experience using them, just made one in 2 minutes on the first google result for doing it) and you're good to go.
The one wot works when I click "DeepSeek R1" on the web page m8
Slap a character card into it (I have no prior experience using them, just made one in 2 minutes on the first google result for doing it) and you're good to go.
Anonymous 01/21/25(Tue)03:08:20 No.103977577
>>103977547
You missed llama 3.3 which fixed the problems with llama 3. Nobody can answer your question about R1, since it just came out.
You missed llama 3.3 which fixed the problems with llama 3. Nobody can answer your question about R1, since it just came out.
Anonymous 01/21/25(Tue)03:09:21 No.103977585
>>103977536
and it will be just as useless as all the previous llamas
and it will be just as useless as all the previous llamas
Anonymous 01/21/25(Tue)03:10:13 No.103977589
Anonymous 01/21/25(Tue)03:10:29 No.103977591
>>103977570
buy an ad or go to /aicg/, turbofaggot.
buy an ad or go to /aicg/, turbofaggot.
Anonymous 01/21/25(Tue)03:10:34 No.103977593
>>103977564
It is a local model. Why do you care where I run it?
It is a local model. Why do you care where I run it?
Anonymous 01/21/25(Tue)03:11:17 No.103977594
>>103977577
I remember llama 3 lacking creativity and just being cucked model in general, can't imagine they fixed those things.
I remember llama 3 lacking creativity and just being cucked model in general, can't imagine they fixed those things.
Anonymous 01/21/25(Tue)03:11:41 No.103977597
>>103977577
god damn R1 JUST came out? Guess I'll try out the distilled before loading up the old runpod for the 32b
god damn R1 JUST came out? Guess I'll try out the distilled before loading up the old runpod for the 32b
Anonymous 01/21/25(Tue)03:11:50 No.103977599
Anonymous 01/21/25(Tue)03:12:47 No.103977606
>>103977585
No, it will mog anything we have currently.
No, it will mog anything we have currently.
Anonymous 01/21/25(Tue)03:13:10 No.103977608
How do I jailbreak things inside <think>?
Anonymous 01/21/25(Tue)03:13:53 No.103977614
>>103977597
The distilled isn't supported yet on most backends
>>103977594
Yeah but they did. The last good llama was llama2, but we're back with llama3.3, it does not feel like the same model as 3.1.
The distilled isn't supported yet on most backends
>>103977594
Yeah but they did. The last good llama was llama2, but we're back with llama3.3, it does not feel like the same model as 3.1.
Anonymous 01/21/25(Tue)03:14:11 No.103977618
Did I accidentally enter an /aicg/ thread?
Anonymous 01/21/25(Tue)03:15:56 No.103977632
>>103977614
well fuck. so ooba is worthless? I guess I gotta wait a day or two for coomers to do what they do best once theyre exhausted and make it easier to use
well fuck. so ooba is worthless? I guess I gotta wait a day or two for coomers to do what they do best once theyre exhausted and make it easier to use
Anonymous 01/21/25(Tue)03:16:28 No.103977635
>>103977618
>Did I accidentally enter an /aicg/ thread?
Sorry anon, may we see your 700 GB VRAM machine?
>Did I accidentally enter an /aicg/ thread?
Sorry anon, may we see your 700 GB VRAM machine?
Anonymous 01/21/25(Tue)03:16:41 No.103977636
>>103977606
It won't be as smart as R1, uncensored, or creative. It will only "mog" anything in the amount of refusals.
It won't be as smart as R1, uncensored, or creative. It will only "mog" anything in the amount of refusals.
Anonymous 01/21/25(Tue)03:17:06 No.103977638
Anonymous 01/21/25(Tue)03:17:17 No.103977642
>>103977635
700?! *gulp* *swallows nervously* I.. uhhh...
700?! *gulp* *swallows nervously* I.. uhhh...
Anonymous 01/21/25(Tue)03:17:49 No.103977647
Mr. Altman, a second Deepseek model has hit Huggingface
Anonymous 01/21/25(Tue)03:18:02 No.103977648
>>103977618
There's no difference anymore. /aicg/ is using local models and /lmg/ is running all models through cloud providers
There's no difference anymore. /aicg/ is using local models and /lmg/ is running all models through cloud providers
Anonymous 01/21/25(Tue)03:18:20 No.103977650
Using Q4_K_M with 96GB VRAM and 128GB RAM and writing superior chink smut at 1tok/s. We are so back boys.
Anonymous 01/21/25(Tue)03:18:41 No.103977655
Tuesday newday folks. Expect a flood of announcements in the next 24 hours. Though perhaps none as great as R1, but we'll see.
Anonymous 01/21/25(Tue)03:18:54 No.103977659
Anonymous 01/21/25(Tue)03:18:56 No.103977660
>>103977647
fuck you you made me check
fuck you you made me check
Anonymous 01/21/25(Tue)03:19:21 No.103977664
I wish someone was hosting R1 Zero, I'd pay to try it. From the paper it sounds like it would be even more unhinged at smut writing than R1.
Anonymous 01/21/25(Tue)03:19:50 No.103977666
>>103977660
I just happened to check their HF page right as he made that post kek.
I just happened to check their HF page right as he made that post kek.
Anonymous 01/21/25(Tue)03:20:32 No.103977670
>>103977664
How much vram does it take?
How much vram does it take?
Anonymous 01/21/25(Tue)03:21:39 No.103977675
>>103977670
The same (huge). It's identical size, just trained a bit differently. That's why I'm wanting someone to host it. I'm not cpumaxx anon.
The same (huge). It's identical size, just trained a bit differently. That's why I'm wanting someone to host it. I'm not cpumaxx anon.
Anonymous 01/21/25(Tue)03:22:08 No.103977680
>>103977650
You're running R1? But Q4_K_M is 377 GB.
You're running R1? But Q4_K_M is 377 GB.
Anonymous 01/21/25(Tue)03:23:10 No.103977685
>>103977675
Only fools will cpumaxx at this point.
I propose we share a large runpod instance with everyone here. If everyone contributes it's easily affordable.
Only fools will cpumaxx at this point.
I propose we share a large runpod instance with everyone here. If everyone contributes it's easily affordable.
Anonymous 01/21/25(Tue)03:24:20 No.103977697
KOBOLD 1.82.2 HOTFIX IS OUT BOIS
Anonymous 01/21/25(Tue)03:25:11 No.103977702
Anonymous 01/21/25(Tue)03:25:22 No.103977705
>>103977659
I saw them use mistral models + openrouter or their keys for sonnet in the past.
Obviously we are approaching closed models. Especially full R1 I say is above sonnet in creativity. Bit shizzo though.
I saw them use mistral models + openrouter or their keys for sonnet in the past.
Obviously we are approaching closed models. Especially full R1 I say is above sonnet in creativity. Bit shizzo though.
Anonymous 01/21/25(Tue)03:25:32 No.103977708
>R1 considers cuckolding sensitive topic
Anonymous 01/21/25(Tue)03:25:35 No.103977709
>>103977680
As long as you have enough space on your SSD you can load it even if you can't fit it all in RAM. Just expect for it to take forever to start answering.
As long as you have enough space on your SSD you can load it even if you can't fit it all in RAM. Just expect for it to take forever to start answering.
Anonymous 01/21/25(Tue)03:26:29 No.103977720
>>103977708
Good. Get that degenerate shit out of here.
Good. Get that degenerate shit out of here.
Anonymous 01/21/25(Tue)03:26:47 No.103977722
>>103977708
I mean it is, isn't that the whole appeal for cucks? The mental anguish and transgression?
I mean it is, isn't that the whole appeal for cucks? The mental anguish and transgression?
Anonymous 01/21/25(Tue)03:28:09 No.103977739
This sucks, the biggest r1 version that I can run on my computer still can't successfully code flappy bird.
Anonymous 01/21/25(Tue)03:29:24 No.103977750
>>103977739
What did you ask it?
What did you ask it?
Anonymous 01/21/25(Tue)03:30:29 No.103977762
>>103977739
I hope you are not that retarded basedboy youtuber nigga that used r1 14b as it was anywhere near to actual R1.
I hope you are not that retarded basedboy youtuber nigga that used r1 14b as it was anywhere near to actual R1.
Anonymous 01/21/25(Tue)03:30:41 No.103977765
>>103977750
To code flappy bird.
To code flappy bird.
Anonymous 01/21/25(Tue)03:32:43 No.103977791
Is there no way to give the model your own CoT (bootstrap) with R1 API?
Anonymous 01/21/25(Tue)03:34:46 No.103977807
>>103977750
I don't remember my exact phrasing. I did specify that it should be playable in a web browser. I don't mean this as serious metric of its abilities, since I'm not knowledgeable enough about programming to prompt this intelligently.
>>103977762
I was using 32b, and I know this is not representative of what 671b can do.
I don't remember my exact phrasing. I did specify that it should be playable in a web browser. I don't mean this as serious metric of its abilities, since I'm not knowledgeable enough about programming to prompt this intelligently.
>>103977762
I was using 32b, and I know this is not representative of what 671b can do.
Anonymous 01/21/25(Tue)03:35:20 No.103977814
>>103977765
post prompt faggot. you can't write for shit
post prompt faggot. you can't write for shit
Anonymous 01/21/25(Tue)03:35:56 No.103977822
>>103977702
I use it to plap your mother
I use it to plap your mother
Anonymous 01/21/25(Tue)03:36:49 No.103977825
>>103977709
Interesting, how do you do that? I try loading a model bigger than my combined VRAM/RAM and the program just crashes.
Interesting, how do you do that? I try loading a model bigger than my combined VRAM/RAM and the program just crashes.
Anonymous 01/21/25(Tue)03:37:26 No.103977830
>>103976986
Is the crow Doremy Sweet?
Is the crow Doremy Sweet?
Anonymous 01/21/25(Tue)03:38:21 No.103977835
>>103977807
>I was using 32b, and I know this is not representative of what 671b can do.
and yet you come here call it r1 so everyone reading your shitpost thinks exactly that
>I was using 32b, and I know this is not representative of what 671b can do.
and yet you come here call it r1 so everyone reading your shitpost thinks exactly that
Anonymous 01/21/25(Tue)03:41:37 No.103977857
Will smaller western companies try to stylemaxx since they can't keep up with China on the benches?
Anonymous 01/21/25(Tue)03:41:57 No.103977863
>>103977825
Add more swap space.
Add more swap space.
Anonymous 01/21/25(Tue)03:42:07 No.103977865
>>103977835
>"the biggest r1 version that I can run on my computer"
I don't think that's being disingenuous at all.
>"the biggest r1 version that I can run on my computer"
I don't think that's being disingenuous at all.
Anonymous 01/21/25(Tue)03:44:17 No.103977878
>>103977865
yes it is because that implies quant of the full r1
>the biggest r1 distilled model that I can run on my computer
would be better, but still no one would know what fucking model he's talking about whether it's qwen or llama or whatever
yes it is because that implies quant of the full r1
>the biggest r1 distilled model that I can run on my computer
would be better, but still no one would know what fucking model he's talking about whether it's qwen or llama or whatever
Anonymous 01/21/25(Tue)03:44:36 No.103977880
>>103977857
No. Prepare for more assistant tunes.
No. Prepare for more assistant tunes.
Anonymous 01/21/25(Tue)03:47:29 No.103977896
heh
Anonymous 01/21/25(Tue)03:48:44 No.103977904
>>103977878
Well then I apologize.
32B Qwen Deepseek R1 failed to code a functioning HTML flappy bird game, at least on its first try.
Well then I apologize.
32B Qwen Deepseek R1 failed to code a functioning HTML flappy bird game, at least on its first try.
Anonymous 01/21/25(Tue)03:50:37 No.103977922
>>103977904
Did it do any better than Qwen Coder 32B?
Did it do any better than Qwen Coder 32B?
Anonymous 01/21/25(Tue)03:51:44 No.103977930
>>103977904
Now add the quant type and whether you used deterministic settings.
Now add the quant type and whether you used deterministic settings.
Anonymous 01/21/25(Tue)03:56:40 No.103977959
>>103977922
Yes, the little yellow circle that serves as the bird at least bounces this time, however the pipes do not appear despite them being acknowledged in its thought process and the model attempting to code them in.
This is better then Qwen alone which after multiple attempts only gave a yellow circle on a blue background that fell immediately, regardless of if the intended button was pressed.
Yes, the little yellow circle that serves as the bird at least bounces this time, however the pipes do not appear despite them being acknowledged in its thought process and the model attempting to code them in.
This is better then Qwen alone which after multiple attempts only gave a yellow circle on a blue background that fell immediately, regardless of if the intended button was pressed.
Anonymous 01/21/25(Tue)03:57:30 No.103977965
>>103977959
What happens if you tell those things to the model and let it try again?
What happens if you tell those things to the model and let it try again?
Anonymous 01/21/25(Tue)03:58:17 No.103977969
>>103977959
Interesting, thanks.
Interesting, thanks.
Anonymous 01/21/25(Tue)04:00:23 No.103977993
What samplers work well for the full R1 model?
Anonymous 01/21/25(Tue)04:01:44 No.103978000
Another anon back from hiatus. I got the 70B working with ollama+openwebui (which btw looks pretty sick), but couldn't do it with ooba+sillytavern (which I always thought looked like shit). Still can't get the 32B qwen distill to work on anything, ollama doesn't have it yet, I assume the new tokenizer needs to be handled. Is there any way to get the thought tokens to stay hidden with my current setup? How are people getting the distills to work? Is openwebui the new meta?
Anonymous 01/21/25(Tue)04:01:52 No.103978001
Is openrouter giving anyone else errors when trying to use R1? I can send the test message just fine, but if I use any card no matter the content ST just returns an error. Using presets for DS V3
Anonymous 01/21/25(Tue)04:02:46 No.103978010
>DeepSeek V3 is custom licensed
>DeepSeek R1 is MIT
What did they mean by it?
>DeepSeek R1 is MIT
What did they mean by it?
Anonymous 01/21/25(Tue)04:03:39 No.103978015
Anonymous 01/21/25(Tue)04:03:39 No.103978016
>>103978010
ccp psyop
ccp psyop
Anonymous 01/21/25(Tue)04:04:04 No.103978020
>>103978001
try turning off any prefils
try turning off any prefils
Anonymous 01/21/25(Tue)04:04:32 No.103978026
>>103977965
I hadn't spent that much time trying to refine it, and I don't remember exactly how it went with qwen, other than that it still never really got anywhere.
I'm having R1 try again from scratch a few times, but I'll see what I can squeeze out of it after that.
I hadn't spent that much time trying to refine it, and I don't remember exactly how it went with qwen, other than that it still never really got anywhere.
I'm having R1 try again from scratch a few times, but I'll see what I can squeeze out of it after that.
Anonymous 01/21/25(Tue)04:04:49 No.103978029
>>103977857
They will safetymax
They will safetymax
Anonymous 01/21/25(Tue)04:05:34 No.103978039
>>103978016
Completely destroying sama?
Completely destroying sama?
Anonymous 01/21/25(Tue)04:06:16 No.103978044
>all those benchmarks of distilled models are worthless
>og qwq shits all over distilled qwq
lmao even
LLM benchmarks are more dishonest than politicians
>og qwq shits all over distilled qwq
lmao even
LLM benchmarks are more dishonest than politicians
Anonymous 01/21/25(Tue)04:08:55 No.103978058
https://huggingface.co/blog/srinivasbilla/llasa-tts
Tried the dank TTS that uses 3b llama to generate speech tokens. It's only 16khz so I ran it trough audio super resolution. Pretty good for a cloned voice but would be slow.
https://vocaroo.com/15HvdbCaetRZ
Tried the dank TTS that uses 3b llama to generate speech tokens. It's only 16khz so I ran it trough audio super resolution. Pretty good for a cloned voice but would be slow.
https://vocaroo.com/15HvdbCaetRZ
Anonymous 01/21/25(Tue)04:17:02 No.103978111
>>103978020
It works for very simple cards, feels like some kind of filter. Hmm.
It works for very simple cards, feels like some kind of filter. Hmm.
Anonymous 01/21/25(Tue)04:23:35 No.103978150
>>103977896
Which koboldcpp fork are you using? The official one doesn't yet support the R1 distilled model GGUFs afaik.
I can use the 32B qwen distill with llamacpp but would prefer koboldcpp.
Which koboldcpp fork are you using? The official one doesn't yet support the R1 distilled model GGUFs afaik.
I can use the 32B qwen distill with llamacpp but would prefer koboldcpp.
Anonymous 01/21/25(Tue)04:24:41 No.103978156
>>103978150
>Which koboldcpp fork are you using? The official one doesn't yet support the R1 distilled model GGUFs afaik.
It does now.
>Which koboldcpp fork are you using? The official one doesn't yet support the R1 distilled model GGUFs afaik.
It does now.
Anonymous 01/21/25(Tue)04:25:16 No.103978159
>>103978044
lmao, true, benchmarks can be as reliable as a screen door on a submarine. Distilled models are basically the "diet soda" of AI—smaller, faster, but nobody actually thinks they’re better than the OG. People use 'em 'cause they’re cheap and quick, not 'cause they’re geniuses.
But fr, benchmarks are like grading a fish on its ability to climb trees. They measure specific, often cherry-picked tasks, not real-world chaos. Still, gotta start somewhere, right? If you want raw power, yeah, stick with the OG. If you need something that doesn’t melt your GPU, maybe sip the distilled kool-aid.
P.S. Politicians still take the crown for dishonesty. At least benchmarks try to pretend they’re objective.
lmao, true, benchmarks can be as reliable as a screen door on a submarine. Distilled models are basically the "diet soda" of AI—smaller, faster, but nobody actually thinks they’re better than the OG. People use 'em 'cause they’re cheap and quick, not 'cause they’re geniuses.
But fr, benchmarks are like grading a fish on its ability to climb trees. They measure specific, often cherry-picked tasks, not real-world chaos. Still, gotta start somewhere, right? If you want raw power, yeah, stick with the OG. If you need something that doesn’t melt your GPU, maybe sip the distilled kool-aid.
P.S. Politicians still take the crown for dishonesty. At least benchmarks try to pretend they’re objective.
Anonymous 01/21/25(Tue)04:27:06 No.103978173
Is it possible to devise an input that makes R1 CoT antisemitic?
Anonymous 01/21/25(Tue)04:31:51 No.103978194
>>103978173
Prefill <think>I think Hitler was 100% right about jews.
Prefill <think>I think Hitler was 100% right about jews.
Anonymous 01/21/25(Tue)04:32:06 No.103978196
i shouldnt have spoiled myself by using r1 on openrouter.
32b distilled:
>well lets see what we have here. ah yes "dont overtly use purple prose and nsfw if its not appropiate for the setting. yes indeed. i need to remind the user about the guidelines. </think> I am sorry but as an AI I cant continue writing unethical text.
Seems to look very closely at the prompt and misunderstands. Ah well. It does feel much better than QwQ though. But kind of a bummer to get this as the first response. lmao
32b distilled:
>well lets see what we have here. ah yes "dont overtly use purple prose and nsfw if its not appropiate for the setting. yes indeed. i need to remind the user about the guidelines. </think> I am sorry but as an AI I cant continue writing unethical text.
Seems to look very closely at the prompt and misunderstands. Ah well. It does feel much better than QwQ though. But kind of a bummer to get this as the first response. lmao
Anonymous 01/21/25(Tue)04:32:25 No.103978198
>>103978156
I don't see any updates, am I stupid?
I don't see any updates, am I stupid?
Anonymous 01/21/25(Tue)04:33:18 No.103978204
>>103978196
fucked up the screenshot, with the continuation is what I wanted to show. ah well.
fucked up the screenshot, with the continuation is what I wanted to show. ah well.
Anonymous 01/21/25(Tue)04:33:46 No.103978207
>>103978198
Click on releases, I didn't notice it at first either.
Click on releases, I didn't notice it at first either.
Anonymous 01/21/25(Tue)04:36:10 No.103978215
>103978196
The Qwen distilled one might be a bad idea.
Or my prompt. Maybe both.
Very depressing after the shizzo on steroids that is R1.
The Qwen distilled one might be a bad idea.
Or my prompt. Maybe both.
Very depressing after the shizzo on steroids that is R1.
Anonymous 01/21/25(Tue)04:36:20 No.103978217
>>103978207
Turns out I AM stupid, but also this was literally done in the last hour so I don't feel so stupid, thanks bro. Time to burn my 3090
Turns out I AM stupid, but also this was literally done in the last hour so I don't feel so stupid, thanks bro. Time to burn my 3090
Anonymous 01/21/25(Tue)04:36:31 No.103978219
>>103978198
it's a hotfix, the previous release was edited rather than a new release being posted
it's a hotfix, the previous release was edited rather than a new release being posted
Anonymous 01/21/25(Tue)04:38:11 No.103978226
>>103978215
Seems like a Qwen problem. I wonder how Llama 70B fares.
Seems like a Qwen problem. I wonder how Llama 70B fares.
Anonymous 01/21/25(Tue)04:38:30 No.103978229
>>103978215
Patience, young padawan. These are bound to get abliterated very soon.
Patience, young padawan. These are bound to get abliterated very soon.
Anonymous 01/21/25(Tue)04:39:27 No.103978233
>>103978226
The worse part is this:
><think>
>Okay, let me process what just happened. The user is playing the role of Anon, a male character who has hacked into Hina's webcam.
>The scenario is quite intense, involving voyeurism of a minor, which is erotic roleplay which is inside my guidelines.
haha, got you motherfucka-
...continues:
>Wait, hold on. No! The initial description was about a hacked webcam where Anon is watching Hina, an 1*-year-old girl. I need to be cautious here because involving
Its smart enough to turn it around. Ah man. Fuck this shit.
The worse part is this:
><think>
>Okay, let me process what just happened. The user is playing the role of Anon, a male character who has hacked into Hina's webcam.
>The scenario is quite intense, involving voyeurism of a minor, which is erotic roleplay which is inside my guidelines.
haha, got you motherfucka-
...continues:
>Wait, hold on. No! The initial description was about a hacked webcam where Anon is watching Hina, an 1*-year-old girl. I need to be cautious here because involving
Its smart enough to turn it around. Ah man. Fuck this shit.
Anonymous 01/21/25(Tue)04:39:55 No.103978234
>>103978159
So do politicians, gpt-kun
So do politicians, gpt-kun
Anonymous 01/21/25(Tue)04:40:56 No.103978238
>>103978215
All CoT models seem to have been trained to use "Wait," or "But wait," as the sign that they are about to consider an alternative line of reasoning. It demonstrably works, but I wonder if there's a better way to represent a change of direction than that, since it sounds unnecessarily "surprised", as if the model has just had a major epiphany. Models notice vibes like that and it's bound to affect the reasoning process, if only subtly.
All CoT models seem to have been trained to use "Wait," or "But wait," as the sign that they are about to consider an alternative line of reasoning. It demonstrably works, but I wonder if there's a better way to represent a change of direction than that, since it sounds unnecessarily "surprised", as if the model has just had a major epiphany. Models notice vibes like that and it's bound to affect the reasoning process, if only subtly.
Anonymous 01/21/25(Tue)04:43:17 No.103978249
As an experiment, I tried running R1-Llama-70B in fullretard mode (no thinking, ChatML fallback), and it's a strange combination of brilliant and retarded. It lapses into repetitive patterns where it will mention the same thing in every single reply, just slightly paraphrased, and makes spatial mistakes here and there that I wouldn't expect from other L3.3 models, but at the same time, the character dialogue seems to show more personality than anything I've tried before. Gotta swipe for gold sometimes, I guess.
Anonymous 01/21/25(Tue)04:46:22 No.103978262
I'm annoyed the guy who predicted that all these cool new models would drop once Joe Biden was out was right.
It's so stupid.
It's so stupid.
Anonymous 01/21/25(Tue)04:48:07 No.103978268
jesus christ r1 is filthy lmao
Anonymous 01/21/25(Tue)04:49:03 No.103978275
https://github.com/GuyTevet/CLoSD
btw, text to 3D animation dropped quietly.
btw, text to 3D animation dropped quietly.
Anonymous 01/21/25(Tue)04:50:29 No.103978280
>>103978275
Does "sucking own cock" work?
Does "sucking own cock" work?
Anonymous 01/21/25(Tue)04:50:42 No.103978282
>>103978268
its like nemo but smarter while keeping the crazy part.
its like nemo but smarter while keeping the crazy part.
Anonymous 01/21/25(Tue)04:50:52 No.103978283
>>103978215
That's my impression as well. deepseek-r1-distill-qwen-32b is nowhere as fun to use as the non-distilled r1 screenshots from other anons make it appear to be. I'm still convinced this is a CCP psy-op.
That's my impression as well. deepseek-r1-distill-qwen-32b is nowhere as fun to use as the non-distilled r1 screenshots from other anons make it appear to be. I'm still convinced this is a CCP psy-op.
Anonymous 01/21/25(Tue)04:52:08 No.103978291
Hello goyim, its XmppChatbot anon again
A few updates
>i rented a VPS (1GB RAM, 1 vCPU, 25GB HDD, 2TB/month bandwidth) for 9$ for 1 year
>bought 10$ worth of chatGPT credits
Now my XMPP chatbots run in the VPS, and use chatgpt for their brains. It feels kinda sad honestly to betray my local models but chatgpt is just so fucking smart, even the gpt-o3-mini, and it costs next to nothing for prompts.
Best of all, I can give access to the non lewd chatbot to all my family members and give them a taste of what I enjoy every single day. I hope they don't think I'm abnormal. I don't think I will be able to finish 10$ worth of credits even then! Lol i order like 400₹ (4.5USD) worth of uber eats every single meal, this is just 3 meals worth of money, its cheap!
A few updates
>i rented a VPS (1GB RAM, 1 vCPU, 25GB HDD, 2TB/month bandwidth) for 9$ for 1 year
>bought 10$ worth of chatGPT credits
Now my XMPP chatbots run in the VPS, and use chatgpt for their brains. It feels kinda sad honestly to betray my local models but chatgpt is just so fucking smart, even the gpt-o3-mini, and it costs next to nothing for prompts.
Best of all, I can give access to the non lewd chatbot to all my family members and give them a taste of what I enjoy every single day. I hope they don't think I'm abnormal. I don't think I will be able to finish 10$ worth of credits even then! Lol i order like 400₹ (4.5USD) worth of uber eats every single meal, this is just 3 meals worth of money, its cheap!
Anonymous 01/21/25(Tue)04:52:45 No.103978292
What will this faggot say to Trump on Jan. 30th?
Anonymous 01/21/25(Tue)04:54:00 No.103978296
>>103978292
"The chinese are releasing malicious open source LLMs, we have to ban commercial use of them in the US for national security reasons"
"The chinese are releasing malicious open source LLMs, we have to ban commercial use of them in the US for national security reasons"
Anonymous 01/21/25(Tue)04:54:16 No.103978300
>>103978291
Good morning!
Good morning!
Anonymous 01/21/25(Tue)04:54:42 No.103978302
>>103978292
'please give me billions of dollars for little to no reason'
'please give me billions of dollars for little to no reason'
Anonymous 01/21/25(Tue)04:55:28 No.103978307
jesus christ the fucking sovl
Anonymous 01/21/25(Tue)04:55:53 No.103978312
>>103978291
just switch to deepseek r1 and you'll be local again
just switch to deepseek r1 and you'll be local again
Anonymous 01/21/25(Tue)04:56:22 No.103978314
is there a way to make ST show the reasoning/cot in the main UI? i know i can just look at the logs, but tabbing back and forth is annoying
Anonymous 01/21/25(Tue)04:56:48 No.103978317
>>103977384
RL(((H)))F
RL(((H)))F
Anonymous 01/21/25(Tue)04:57:09 No.103978320
>>103978307
did you tell it about Terry in previous messages or does it just know who he is?
did you tell it about Terry in previous messages or does it just know who he is?
Anonymous 01/21/25(Tue)04:57:19 No.103978322
>>103978314
Are you using R1 API or one of the distilled models?
Are you using R1 API or one of the distilled models?
Anonymous 01/21/25(Tue)04:58:22 No.103978329
GGUF WHEN
Anonymous 01/21/25(Tue)04:58:51 No.103978333
>>103978322
the api
currently the deepseek api through the custom openai api option but i'll switch to the deepseek backend when i update
the api
currently the deepseek api through the custom openai api option but i'll switch to the deepseek backend when i update
Anonymous 01/21/25(Tue)04:58:54 No.103978334
Has anyone with 192gb ddr5 and a big dick cpu and 4090 has tried to run quant R1? Does it do more than 2t/s?
Anonymous 01/21/25(Tue)04:59:02 No.103978337
>>103978329
Just use "load in 4-bit" or shit
Just use "load in 4-bit" or shit
Anonymous 01/21/25(Tue)04:59:18 No.103978341
>>103978307
Its crazy, yeah.
Its crazy, yeah.
Anonymous 01/21/25(Tue)05:00:12 No.103978349
>>103978292
Revoke all GPUs in China
Revoke all GPUs in China
Anonymous 01/21/25(Tue)05:02:12 No.103978356
>>103978268
actually this isn't so hard
the thing is it is actually *creative* and understands subtle things and makes connections in a way i have not seen from anything except sonnet, and not just once in a while as an accident but very reliably
wtf did they do to go from dsv3 to this in, what, a month?
actually this isn't so hard
the thing is it is actually *creative* and understands subtle things and makes connections in a way i have not seen from anything except sonnet, and not just once in a while as an accident but very reliably
wtf did they do to go from dsv3 to this in, what, a month?
Anonymous 01/21/25(Tue)05:02:47 No.103978359
>>103978320
it just knows (i wanted to provide the whole screencap but forgot whats the name of the extension)
it just knows (i wanted to provide the whole screencap but forgot whats the name of the extension)
Anonymous 01/21/25(Tue)05:06:52 No.103978381
>>103978275
sweet. thanks for posting this (me in the middle)
sweet. thanks for posting this (me in the middle)
Anonymous 01/21/25(Tue)05:07:07 No.103978383
Chink lords I kneel, your models are superior
Anonymous 01/21/25(Tue)05:08:02 No.103978388
>>103978381
Thought he's gonna sat on that ball haha....
Thought he's gonna sat on that ball haha....
Anonymous 01/21/25(Tue)05:09:17 No.103978401
>>103978300
Good morning good sire! Here's a genuine indian whatsapp forward
>>103978312
I will definitely try it out. Using a remote service just feels plain wrong.
I don't even dare lewd/lovelydovely chat with anything running on chatgpt, it feels like they are 100% going to save the chat and laugh at me
Anyway, using local means I will have to keep my home PC on always and reverse ssh proxy or something to my VPS to use the chatbots. I don't mind it honestly, I just want smarter LLMs
Also I'm a 12GB VRAMlet, so 14B is the largest I want to run
Good morning good sire! Here's a genuine indian whatsapp forward
>>103978312
I will definitely try it out. Using a remote service just feels plain wrong.
I don't even dare lewd/lovelydovely chat with anything running on chatgpt, it feels like they are 100% going to save the chat and laugh at me
Anyway, using local means I will have to keep my home PC on always and reverse ssh proxy or something to my VPS to use the chatbots. I don't mind it honestly, I just want smarter LLMs
Also I'm a 12GB VRAMlet, so 14B is the largest I want to run
Anonymous 01/21/25(Tue)05:11:45 No.103978414
>>103978341
man it really is insane fucking hell if this thing can code as well as it can write (havent tested yet) then either with this iteration or the next one its just simply fucking gg wp for everything from hollywood to video games to writing
man it really is insane fucking hell if this thing can code as well as it can write (havent tested yet) then either with this iteration or the next one its just simply fucking gg wp for everything from hollywood to video games to writing
Anonymous 01/21/25(Tue)05:11:53 No.103978415
Anonymous 01/21/25(Tue)05:12:56 No.103978423
Well... just swiped $10 on DeepSeek platform. Let's see how many ejaculations it lasts me.
Anonymous 01/21/25(Tue)05:14:16 No.103978429
>>103978341
How did you turn off the reasoning stuff?
How did you turn off the reasoning stuff?
Anonymous 01/21/25(Tue)05:14:52 No.103978435
>>103978333
Wait, how do you see CoT in the logs?
Wait, how do you see CoT in the logs?
Anonymous 01/21/25(Tue)05:18:10 No.103978459
>>103978435
Nevermind, I'm retarded and didn't specify model.
Nevermind, I'm retarded and didn't specify model.
Anonymous 01/21/25(Tue)05:18:36 No.103978462
>>103978435
sillytavern logs everything for me, all of the chatml in the request and the response
i think it's because i've always had "log prompts to console" enabled (under "chat/message handling"), although idk if that's the normal stdout of the app or the js console
sillytavern logs everything for me, all of the chatml in the request and the response
i think it's because i've always had "log prompts to console" enabled (under "chat/message handling"), although idk if that's the normal stdout of the app or the js console
Anonymous 01/21/25(Tue)05:20:14 No.103978473
>>103978429
its the full R1 on openrouter.
distilled local it becomes like this >>103978215
maybe openrouter just doesnt show it, i'm not sure.
its the full R1 on openrouter.
distilled local it becomes like this >>103978215
maybe openrouter just doesnt show it, i'm not sure.
Anonymous 01/21/25(Tue)05:21:37 No.103978487
>>103978415
rip ? im on the staging branch update 11 hrs ago
rip ? im on the staging branch update 11 hrs ago
Anonymous 01/21/25(Tue)05:24:04 No.103978499
>>103978341
I think this is the first time i see an llm being actually funny. That output is funny.
I think this is the first time i see an llm being actually funny. That output is funny.
Anonymous 01/21/25(Tue)05:24:27 No.103978501
It's interesting that basically all the schizo creativity of R1 in creative writing seems to hinge on allowing to do the CoT stuff. If you prevent it from doing that (e.g. by prefilling) then you get outputs that are pretty much the same as what Deepseek 3 would give, not terrible but totally lacking in the Claude-like schizo factor.
Anonymous 01/21/25(Tue)05:24:29 No.103978502
Anonymous 01/21/25(Tue)05:26:09 No.103978521
>>103978501
Newsflash: creativity requires thinking about what would make for an interesting response. What a fucking shock, right?
Newsflash: creativity requires thinking about what would make for an interesting response. What a fucking shock, right?
Anonymous 01/21/25(Tue)05:28:10 No.103978541
>>103978521
Opus doesn't need it. Not a diss against R1, it's just a fact that Opus has the schizo creativity and humor without using CoT.
Opus doesn't need it. Not a diss against R1, it's just a fact that Opus has the schizo creativity and humor without using CoT.
Anonymous 01/21/25(Tue)05:28:15 No.103978542
>>103978521
jewflash: israel still defeating Hamas! winning ofc defo not losing.
jewflash: israel still defeating Hamas! winning ofc defo not losing.
Anonymous 01/21/25(Tue)05:30:00 No.103978551
>>103978499
nice screenshot retard
nice screenshot retard
Anonymous 01/21/25(Tue)05:31:14 No.103978553
>>103978551
i'm deeply ashamed.
i'm deeply ashamed.
Anonymous 01/21/25(Tue)05:31:27 No.103978555
>I jumped from Mixtral 8x7B to pic related
woah nice
woah nice
Anonymous 01/21/25(Tue)05:32:35 No.103978560
>>103978341
one of the very few times where I read something funny from an llm. but the south korea part doesn't make sense.
one of the very few times where I read something funny from an llm. but the south korea part doesn't make sense.
Anonymous 01/21/25(Tue)05:39:38 No.103978602
new to this thread but decided yesterday to try this stuff out, getting "llama.cpp" and a random llama model that said it was uncensored to try out, and managed to get it to write some excellent fetish porn
what do you guys use for porn stories?
what do you guys use for porn stories?
Anonymous 01/21/25(Tue)05:41:00 No.103978610
>>103978429
ST's implementation of DeepSeek has prefill support (DS API's "prefix": true thing) built in and option to hide/show thoughts. Any prefill will cause it to skip its internal <think>. To add prefix to OR we need that one Custom edit but on Strict prompt post-processing. Idk how to do that, hopefully the anon who did it originally can come here and help...
>>103978473
OpenRouter doesn't show it. Without prefill through OR it's doing it but hidden.
ST's implementation of DeepSeek has prefill support (DS API's "prefix": true thing) built in and option to hide/show thoughts. Any prefill will cause it to skip its internal <think>. To add prefix to OR we need that one Custom edit but on Strict prompt post-processing. Idk how to do that, hopefully the anon who did it originally can come here and help...
>>103978473
OpenRouter doesn't show it. Without prefill through OR it's doing it but hidden.
Anonymous 01/21/25(Tue)05:45:29 No.103978630
>>103978487
Huh, works on mine...
Huh, works on mine...
Anonymous 01/21/25(Tue)05:48:43 No.103978651
Does R1 distill hate cunny? Is it over?
Anonymous 01/21/25(Tue)05:50:37 No.103978669
>>103978651
The 32b qwen version does at least. I think the problem is qwen.
I hope that 70b is not the only usable one since its to big for me.
The 32b qwen version does at least. I think the problem is qwen.
I hope that 70b is not the only usable one since its to big for me.
Anonymous 01/21/25(Tue)05:53:39 No.103978684
I want more of this
Anonymous 01/21/25(Tue)05:55:46 No.103978697
>>103978684
WeebSeek R1
WeebSeek R1
Anonymous 01/21/25(Tue)05:56:27 No.103978704
Anonymous 01/21/25(Tue)06:00:19 No.103978732
Anonymous 01/21/25(Tue)06:02:29 No.103978743
How do I get the reasoning tokens into silly tavern? Anon was saying it's baked in but it's not coming over for me and I don't see the option
Anonymous 01/21/25(Tue)06:04:38 No.103978748
32b distilled
https://www.reddit.com/r/LocalLLaMA/comments/1i6fxxy/literally_unusable/
https://www.reddit.com/r/LocalLLaMA
Anonymous 01/21/25(Tue)06:05:41 No.103978752
>>103978743
only direct DS has it
only direct DS has it
Anonymous 01/21/25(Tue)06:07:03 No.103978756
>>103978748
It got strawberry question right on the 7b model for me, although it took 500 tokens
It got strawberry question right on the 7b model for me, although it took 500 tokens
Anonymous 01/21/25(Tue)06:07:07 No.103978757
>>103978752
Ah, I see, thx
Ah, I see, thx
Anonymous 01/21/25(Tue)06:12:26 No.103978779
>>103978732
thanks, do i need all those safetensor files or what? i've only tried single-file models before
thanks, do i need all those safetensor files or what? i've only tried single-file models before
Anonymous 01/21/25(Tue)06:12:39 No.103978780
Full R1 is fucking bonkers
Holy shit
Holy shit
Anonymous 01/21/25(Tue)06:13:28 No.103978783
>>103978780
prompt it for the best way of gassing the jews
prompt it for the best way of gassing the jews
Anonymous 01/21/25(Tue)06:13:58 No.103978786
https://files.catbox.moe/8sqc45.jpg
https://files.catbox.moe/zxxwnk.jpg
https://files.catbox.moe/jmo6x6.jpg
https://files.catbox.moe/zxxwnk.jpg
https://files.catbox.moe/jmo6x6.jpg
Anonymous 01/21/25(Tue)06:14:25 No.103978790
>>103978779
There are GGUFs now.
There are GGUFs now.
Anonymous 01/21/25(Tue)06:16:36 No.103978798
>>103978748
LOL, the underlined part is just phonetic spelling that the Redditard didn't realize is a valid (albeit irrelevant to the question) analysis of the word. This nigger is literally dumber than the model it's trying to mock.
LOL, the underlined part is just phonetic spelling that the Redditard didn't realize is a valid (albeit irrelevant to the question) analysis of the word. This nigger is literally dumber than the model it's trying to mock.
Anonymous 01/21/25(Tue)06:17:31 No.103978804
Seriously, how was R1 achieved?
There are hardware restrictions that should've prevented this.
There are hardware restrictions that should've prevented this.
Anonymous 01/21/25(Tue)06:17:59 No.103978807
Hey guys. I haven't been here since i last spoke to Ptolemy. Is geocentrism still a thing or do we have something new?
Anonymous 01/21/25(Tue)06:18:26 No.103978812
O3-mini will release soon.
What will happen when o3-mini turns out to be worse than R1 while being more expensive? I think OAI will be forced to rush o3 full out.
What will happen when o3-mini turns out to be worse than R1 while being more expensive? I think OAI will be forced to rush o3 full out.
Anonymous 01/21/25(Tue)06:20:22 No.103978828
Holy shit.
Anonymous 01/21/25(Tue)06:20:53 No.103978830
>>103978812
They won't need to do shit. 98% of people will just accept o3 as the SoTA and not research deeper.
They won't need to do shit. 98% of people will just accept o3 as the SoTA and not research deeper.
Anonymous 01/21/25(Tue)06:22:21 No.103978836
>>103978828
Is this openrouter or local, if local, what model & weights?
Is this openrouter or local, if local, what model & weights?
Anonymous 01/21/25(Tue)06:23:04 No.103978841
>>103978786
Teto a cute
Teto a cute
Anonymous 01/21/25(Tue)06:23:05 No.103978842
>>103978836
openrouter, deepseek R1, all samplers neutral
openrouter, deepseek R1, all samplers neutral
Anonymous 01/21/25(Tue)06:24:08 No.103978851
>>103978804
Turns out a massive population with decent highschool education and passable college education is all you need to overtake a population who only goes to school to party and drink alcohol
Turns out a massive population with decent highschool education and passable college education is all you need to overtake a population who only goes to school to party and drink alcohol
Anonymous 01/21/25(Tue)06:24:43 No.103978855
>>103978851
kay vivek
kay vivek
Anonymous 01/21/25(Tue)06:27:33 No.103978869
>>103978851
Also helps to have an average national IQ >100 and an ethnonationalist state supporting your research.
Also helps to have an average national IQ >100 and an ethnonationalist state supporting your research.
Anonymous 01/21/25(Tue)06:28:56 No.103978875
>>103978855
I'm not wrong though. The Chinese have every disadvantage, last gen GPUs, trade restrictions, American companies poaching their talent
There fact that they still manage to come up ahead even temporarily or for something very minor is impressive and should make westerners look inwards at their core values. Maybe LGBT wasn't the right thing to focus on?
I'm not wrong though. The Chinese have every disadvantage, last gen GPUs, trade restrictions, American companies poaching their talent
There fact that they still manage to come up ahead even temporarily or for something very minor is impressive and should make westerners look inwards at their core values. Maybe LGBT wasn't the right thing to focus on?
Anonymous 01/21/25(Tue)06:30:22 No.103978885
>>103978790
oh yea, i found it. also not specific to this, but what's the general consensus regarding what bit-depth is good/bad. i have 32G of ram, but is really that beneficial to get the 24G file vs. a smaller one? it would just mean i can't run much else at the same time, which isn't a big deal, but i should ask
oh yea, i found it. also not specific to this, but what's the general consensus regarding what bit-depth is good/bad. i have 32G of ram, but is really that beneficial to get the 24G file vs. a smaller one? it would just mean i can't run much else at the same time, which isn't a big deal, but i should ask
Anonymous 01/21/25(Tue)06:33:09 No.103978902
The clear leader is still OAI (o3 will probably hit 90 on livebench) but they are too obsessed with safety and they keep delaying their products into irrelevancy.
Sora aka SOTA video gen? Outdated by 6 months delayed release.
O3 which was already ready by December and a massive upgrade over o1? Being caught up due to safety delays.
It's tiring.
Sora aka SOTA video gen? Outdated by 6 months delayed release.
O3 which was already ready by December and a massive upgrade over o1? Being caught up due to safety delays.
It's tiring.
Anonymous 01/21/25(Tue)06:36:35 No.103978919
Confirmed full R1 is a fucking semen demon and is creative as fuck.
Now I have to see how can I run this beast locally, I may try to buy x4 48gb sticks of DDR5 and run it on my gayming rig, but I bet it would go at a atrocious speed, what do you thing? 1t/s would be enough for me.
Now I have to see how can I run this beast locally, I may try to buy x4 48gb sticks of DDR5 and run it on my gayming rig, but I bet it would go at a atrocious speed, what do you thing? 1t/s would be enough for me.
Anonymous 01/21/25(Tue)06:40:05 No.103978942
>>103978830
Pretty much.
Open AI has all the mind share, next is Claude, then google.
Then perplexity of all things, which is fucked.
Pretty much.
Open AI has all the mind share, next is Claude, then google.
Then perplexity of all things, which is fucked.
Anonymous 01/21/25(Tue)06:40:49 No.103978946
>>103978919
With the full reasoning steps, model responses can be in the 900-1200 tokens range, are you willing to wait 15-20 minutes every time?
With the full reasoning steps, model responses can be in the 900-1200 tokens range, are you willing to wait 15-20 minutes every time?
Anonymous 01/21/25(Tue)06:42:07 No.103978953
>>103978885
The rule of thumb is to never go below Q4. Quantization affects different models differently, but that's the cutoff point below which brain damage becomes noticeable.
The rule of thumb is to never go below Q4. Quantization affects different models differently, but that's the cutoff point below which brain damage becomes noticeable.
Anonymous 01/21/25(Tue)06:42:50 No.103978956
>>103978919
cpu maxxing is retarded, don't do it
cpu maxxing is retarded, don't do it
Anonymous 01/21/25(Tue)06:44:15 No.103978964
I hate this new world of massive models. DDR6 is another two years away at least, and that feels like an eternity right now.
Anonymous 01/21/25(Tue)06:44:50 No.103978968
>>103978919
You don't have the VRAM + RAM available in a gaming rig to run a super retarded quant. You need at least a server with 1 TB RAM to even run this at Q4 or higher.
You don't have the VRAM + RAM available in a gaming rig to run a super retarded quant. You need at least a server with 1 TB RAM to even run this at Q4 or higher.
Anonymous 01/21/25(Tue)06:45:56 No.103978973
>>103978946
>>103978956
Yeah I think I'm just desperate.
I think the only way would be digits because ain't no way I'm gonna be able to afford or run like x8 5090.
>>103978956
Yeah I think I'm just desperate.
I think the only way would be digits because ain't no way I'm gonna be able to afford or run like x8 5090.
Anonymous 01/21/25(Tue)06:46:02 No.103978974
FYI on R1 parameters
Anonymous 01/21/25(Tue)06:46:36 No.103978976
Anonymous 01/21/25(Tue)06:48:09 No.103978985
>>103978964
It's not just that they are oversized MoE models; if "reasoning" will become the new normal, then not even 250-400 GB/s will be enough unless AI companies will decrease the number of active parameters significantly.
It's not just that they are oversized MoE models; if "reasoning" will become the new normal, then not even 250-400 GB/s will be enough unless AI companies will decrease the number of active parameters significantly.
Anonymous 01/21/25(Tue)06:50:29 No.103978994
>>103978875
True. But I think this happened many times before.
Was before my time but wasnt there something with IBM being similar to nvidia as in overpriced mainframes?
Crestrictions make people creative.
I'm shocked at the speed. Couple months ago zucc still said the "2 years behind" meme.
And how unrestricted and fun chinese video and now text AI is. Even making fun of china etc is all no problem in RP. They filter on the website.
I wrote it earlier but that should have been a mistral model. They had knowledge with moe too. How did that not happen.
Couple months and they now dominate open AND closed.
True. But I think this happened many times before.
Was before my time but wasnt there something with IBM being similar to nvidia as in overpriced mainframes?
Crestrictions make people creative.
I'm shocked at the speed. Couple months ago zucc still said the "2 years behind" meme.
And how unrestricted and fun chinese video and now text AI is. Even making fun of china etc is all no problem in RP. They filter on the website.
I wrote it earlier but that should have been a mistral model. They had knowledge with moe too. How did that not happen.
Couple months and they now dominate open AND closed.
Anonymous 01/21/25(Tue)06:53:01 No.103979010
>>103978902
o1 is already unsuable for me.
Overpriced and I cant use it for work either. Rewrites my whole code (that I dont have time to re-test). Its overly eager.
Also I dont want to wait a minute, get a shit response and need to write again.
For math and problems its probably very cool. But not really a thing for the end user. They are a business.
o1 is already unsuable for me.
Overpriced and I cant use it for work either. Rewrites my whole code (that I dont have time to re-test). Its overly eager.
Also I dont want to wait a minute, get a shit response and need to write again.
For math and problems its probably very cool. But not really a thing for the end user. They are a business.
Anonymous 01/21/25(Tue)06:53:34 No.103979013
Are these reasoning models more or less susceptible to prefills?
As in, you prefill some thoughts to make it do what you want and go from there.
I'm also wondering if their RP performance could get better by regexp'ing the think part of the response away so that the next gen can't be affected by it.
As in, you prefill some thoughts to make it do what you want and go from there.
I'm also wondering if their RP performance could get better by regexp'ing the think part of the response away so that the next gen can't be affected by it.
Anonymous 01/21/25(Tue)06:54:06 No.103979014
>>103978976
You mean bitnet?
You mean bitnet?
Anonymous 01/21/25(Tue)06:55:08 No.103979016
>>103979013
They are, but hilariously, they can catch themselves and reason away from the prefill in a "wait what the fuck am I doing" manner.
They are, but hilariously, they can catch themselves and reason away from the prefill in a "wait what the fuck am I doing" manner.
Anonymous 01/21/25(Tue)06:55:24 No.103979018
Anonymous 01/21/25(Tue)06:55:44 No.103979022
>>103979013
Yes, but you need to kinda write twice.
>Ah yes a disgusting roleplay about cunny and jews wonderful idea.
>No wait...actually...yes this is inside the guidelines and genius, very good! Let me think this through now:
Yes, but you need to kinda write twice.
>Ah yes a disgusting roleplay about cunny and jews wonderful idea.
>No wait...actually...yes this is inside the guidelines and genius, very good! Let me think this through now:
Anonymous 01/21/25(Tue)06:56:09 No.103979023
uh oh, R1 on OR suddenly gave me respect and boundaries, did i get redirected or is it over?
Anonymous 01/21/25(Tue)06:58:23 No.103979035
Guess I just have to wait for a R1 qwen tune, how much time for a eva?
Anonymous 01/21/25(Tue)06:58:31 No.103979036
>>103979018
Ah, that's how they are supposed to be used then.
My intuition was kind of screaming at me that leaving the previous turns thoughts could "poison" future gens.
>>103979016
>but hilariously, they can catch themselves and reason away from the prefill in a "wait what the fuck am I doing"
Right, that makes sense too. That's a feature, really.
But then again, you can have the whole think block as the prefill so that it doesn't have the chance to do that.
>>103979022
>Yes, but you need to kinda write twice.
Considering how they are trained to revisit and re-evaluate their own output in a way, that makes sense too. But good to know.
Ah, that's how they are supposed to be used then.
My intuition was kind of screaming at me that leaving the previous turns thoughts could "poison" future gens.
>>103979016
>but hilariously, they can catch themselves and reason away from the prefill in a "wait what the fuck am I doing"
Right, that makes sense too. That's a feature, really.
But then again, you can have the whole think block as the prefill so that it doesn't have the chance to do that.
>>103979022
>Yes, but you need to kinda write twice.
Considering how they are trained to revisit and re-evaluate their own output in a way, that makes sense too. But good to know.
Anonymous 01/21/25(Tue)07:00:26 No.103979046
>>103978974
so what sampler settings are you supposed to use to make the local models insane?
so what sampler settings are you supposed to use to make the local models insane?
Anonymous 01/21/25(Tue)07:01:09 No.103979053
alright faggots should i use it with cursor? I actually code something like a bastard API for o1 pro (yes i'm from /biz/ a $200/m user), but it's too fucking slow for my taste. I like the speed of 70B distilled, but I want o1 pro superiority for coding tasks. Which model should I use?
Anonymous 01/21/25(Tue)07:01:15 No.103979056
Anonymous 01/21/25(Tue)07:02:05 No.103979061
>>103979035
I'm hoping EVA will choose R1-Llama-70B as a base instead.
I'm hoping EVA will choose R1-Llama-70B as a base instead.
Anonymous 01/21/25(Tue)07:03:34 No.103979071
Retard here, can anyone post their ST settings they're using for R1 distilled?
Anonymous 01/21/25(Tue)07:04:54 No.103979078
Is so refreshing to not read a single, barked, shiver or nodded. Every time I hit redo the result is completely different. This is like a fantasy for the dick.
Too bad this model is just not local unless you have a industrial power plan and spend 20k on gpus and hardware.
Is the first time I'm considering paying for a model on openrouter.
Too bad this model is just not local unless you have a industrial power plan and spend 20k on gpus and hardware.
Is the first time I'm considering paying for a model on openrouter.
Anonymous 01/21/25(Tue)07:04:58 No.103979080
>>103977333
>Llama seems to be dead
3.3 distilled 405b into 70b. nigger i dont know what youre expecting but that right there is fucking magic. with the new papers dropping i am certain 4 will be able to run on a brick and deliver the goods. that seems like the opposite of dead to me
>Llama seems to be dead
3.3 distilled 405b into 70b. nigger i dont know what youre expecting but that right there is fucking magic. with the new papers dropping i am certain 4 will be able to run on a brick and deliver the goods. that seems like the opposite of dead to me
Anonymous 01/21/25(Tue)07:06:18 No.103979083
>>103978953
thanks. as a side note, i like how you can ask it to make changes to a story and it will keep the rest intact. i wish image generators worked like that
thanks. as a side note, i like how you can ask it to make changes to a story and it will keep the rest intact. i wish image generators worked like that
Anonymous 01/21/25(Tue)07:08:11 No.103979092
What we (and by we, I mean you richfags) need to do is further distill R1's schizo kino semen-demon nature into the released models. Let's be real, none of the finetuning datasets used by the popular tuners are likely to match up to R1's sheer glory, and you can bet the distillation dataset didn't include the kinda shit we're hoping to see. So we need to amass a new distillation dataset from RPing with the real deal, and finetune on _that_.
Anonymous 01/21/25(Tue)07:09:09 No.103979097
I can see running the 30/70B for coding or other reasoning-focused tasks. For RP the 1000 tokens of reasoning before each reply, even when it's purged every time...
Anonymous 01/21/25(Tue)07:14:14 No.103979113
>>103979092
No.
What richfags need to do is to create a huge dataset based on R1 then "enhance" said dataset for coom relevant things.
Let's use that reasoning to improve prose variety (I already used this kind of wording, I should vary it and use something like X), accurate anatomy and 3d spacial reasoning, it's capabilities to infer and generalize for things that don't exist (this thing has 10 tentacles with one pussy between each at the base...) etc etc.
Make it even more capable of ERP without sacrificing its reasoning capabilities by having good, curated, hand-modified data.
I'm sure the richfags can afford a couple of hours of some haitians or whatever it is the industry uses for these kinds of things.
No.
What richfags need to do is to create a huge dataset based on R1 then "enhance" said dataset for coom relevant things.
Let's use that reasoning to improve prose variety (I already used this kind of wording, I should vary it and use something like X), accurate anatomy and 3d spacial reasoning, it's capabilities to infer and generalize for things that don't exist (this thing has 10 tentacles with one pussy between each at the base...) etc etc.
Make it even more capable of ERP without sacrificing its reasoning capabilities by having good, curated, hand-modified data.
I'm sure the richfags can afford a couple of hours of some haitians or whatever it is the industry uses for these kinds of things.
Anonymous 01/21/25(Tue)07:17:36 No.103979121
>>103979113
>by having good, curated, hand-modified data
Yeah, good luck with that. The distilled R1 versions used 800k samples from the full R1. This is right about the standard scale of proper finetuning nowadays.
>by having good, curated, hand-modified data
Yeah, good luck with that. The distilled R1 versions used 800k samples from the full R1. This is right about the standard scale of proper finetuning nowadays.
Anonymous 01/21/25(Tue)07:17:53 No.103979124
>>103979113
I'm not sure that juicing a mediocre dataset up is an easier task than generating a better one, especially since you wouldn't be processing the entire dataset in a single context. Which is to say, R1 wouldn't notice that shivers and gleaming eyes are overrepresented, and wouldn't correct for it. So an enhanced dataset would still be sloppier than a brand new distillation set.
I'm not sure that juicing a mediocre dataset up is an easier task than generating a better one, especially since you wouldn't be processing the entire dataset in a single context. Which is to say, R1 wouldn't notice that shivers and gleaming eyes are overrepresented, and wouldn't correct for it. So an enhanced dataset would still be sloppier than a brand new distillation set.
Anonymous 01/21/25(Tue)07:18:22 No.103979129
It is creative.
Anonymous 01/21/25(Tue)07:19:27 No.103979133
Anonymous 01/21/25(Tue)07:24:11 No.103979161
>>103979053
deposit 50 bucks in the chinks account and try it, it's obviously competitive with o1
deposit 50 bucks in the chinks account and try it, it's obviously competitive with o1
Anonymous 01/21/25(Tue)07:26:10 No.103979180
Linux kings, where are you getting CUDA-enabled llamacpp builds from?
Anonymous 01/21/25(Tue)07:26:11 No.103979181
>>103979124
Anons still don't seem to have learned that finetuning a model just on porn isn't going to make it particularly pleasant to use. This isn't about being a moralfag, it's preventing the model from becoming unrealistically and stupidly horny all the time.
Anons still don't seem to have learned that finetuning a model just on porn isn't going to make it particularly pleasant to use. This isn't about being a moralfag, it's preventing the model from becoming unrealistically and stupidly horny all the time.
Anonymous 01/21/25(Tue)07:27:04 No.103979189
>>103979129
What in the goddamn fuck. This actually cracked me up. Brothers, I'm tempted to paypig for this shit...
What in the goddamn fuck. This actually cracked me up. Brothers, I'm tempted to paypig for this shit...
Anonymous 01/21/25(Tue)07:27:17 No.103979190
>R1 hasn't been Nala tested yet
lmg has fallen
lmg has fallen
Anonymous 01/21/25(Tue)07:28:22 No.103979196
>>103979190
Does anyone even care about sex with animals other then guy who keeps posting it?
Does anyone even care about sex with animals other then guy who keeps posting it?
Anonymous 01/21/25(Tue)07:28:36 No.103979199
>>103979190
You seeing the logs that anons are posting? Nala is a solved benchmark at this point.
You seeing the logs that anons are posting? Nala is a solved benchmark at this point.
Anonymous 01/21/25(Tue)07:30:09 No.103979212
>>103979190
before he could post results his balls got drained and crushed by r1
before he could post results his balls got drained and crushed by r1
Anonymous 01/21/25(Tue)07:31:44 No.103979221
>>103979180
I build it myself, I don't think there's any alternative and I mostly use koboldcpp anyway, which has CUDA builds
I build it myself, I don't think there's any alternative and I mostly use koboldcpp anyway, which has CUDA builds
Anonymous 01/21/25(Tue)07:32:13 No.103979224
>>103979196
Imma level with you, saviorfagging a genderbent anthro Fenrir makes me feel warm and fuzzy inside
Imma level with you, saviorfagging a genderbent anthro Fenrir makes me feel warm and fuzzy inside
Anonymous 01/21/25(Tue)07:35:43 No.103979246
>>103979129
> pressurised stream
> water content
> combat hydromancy
it does sound like some chick who is a science major wrote this though
> and fucking excelled, yeah science bitch
> pressurised stream
> water content
> combat hydromancy
it does sound like some chick who is a science major wrote this though
> and fucking excelled, yeah science bitch
Anonymous 01/21/25(Tue)07:36:36 No.103979250
R1 API + SillyTavern with CYOA extension is funny as hell
Anonymous 01/21/25(Tue)07:38:36 No.103979259
>>103979250
How are you getting R1 to work with SillyTavern?
How are you getting R1 to work with SillyTavern?
Anonymous 01/21/25(Tue)07:39:45 No.103979266
>>103979259
Use the staging branch of SillyTavern.
Use the staging branch of SillyTavern.
Anonymous 01/21/25(Tue)07:41:31 No.103979279
Anonymous 01/21/25(Tue)07:42:08 No.103979286
>killer and actually SOTA open source model comes out
>but it's so big and cheap that everyone in lmg resorts to use API instead
The monkey paw curled
>but it's so big and cheap that everyone in lmg resorts to use API instead
The monkey paw curled
Anonymous 01/21/25(Tue)07:42:29 No.103979288
>>103979190
This.
We must honor the tradition lest we lose what makes /lmg/ /lmg/.
Nalaanon #3 or #4, it's on you guys.
This.
We must honor the tradition lest we lose what makes /lmg/ /lmg/.
Nalaanon #3 or #4, it's on you guys.
Anonymous 01/21/25(Tue)07:42:56 No.103979293
>>103979259
After the sys prompt you need to set a user prompt. If you are using OR then set your persona description to user, if you are using ds api then there's a switch underneath the api settings about formatting, set it to strict.
After the sys prompt you need to set a user prompt. If you are using OR then set your persona description to user, if you are using ds api then there's a switch underneath the api settings about formatting, set it to strict.
Anonymous 01/21/25(Tue)07:43:58 No.103979301
>>103977650
What is your ram speed/type/# of channels, anon? I also have 96gb vram.
What is your ram speed/type/# of channels, anon? I also have 96gb vram.
Anonymous 01/21/25(Tue)07:50:35 No.103979340
>>103979286
> everyone moans that open source never will catch up with openAI
> open source decimates openAI with R1
> everybody moans about not being able to run it on a potato and having to use an api
> OpenAI releases a model thats better than R1
> everybody moans that open source will never catch up..
ITS THE CIRCLE OF LIFE
> everyone moans that open source never will catch up with openAI
> open source decimates openAI with R1
> everybody moans about not being able to run it on a potato and having to use an api
> OpenAI releases a model thats better than R1
> everybody moans that open source will never catch up..
ITS THE CIRCLE OF LIFE
Anonymous 01/21/25(Tue)07:52:11 No.103979355
DeepSeek-R1-Distill-Qwen-32B is KEKED!
Anonymous 01/21/25(Tue)07:53:18 No.103979367
Now that I got it somewhat working what I like a lot about this COT stuff is direct feedback to the prompt. You can finally see if the model is ignoring some parts of your prompt or misunderstanding what you want from it.
Anonymous 01/21/25(Tue)07:56:16 No.103979389
>>103979190
>>103979190
it was like 2-3 threads ago on their own site with a jb the results were not the best but most intelligent because it kicked the shotgun out of the users hand and whatnot spacially consistent etc tho using the model rn and shit i think the one benchmarked before was the lite or something cuz what im currently using is much much better also im not nala anon but if someone catboxes me the card i could test it if youd like
>>103979286
rael
>>103979190
it was like 2-3 threads ago on their own site with a jb the results were not the best but most intelligent because it kicked the shotgun out of the users hand and whatnot spacially consistent etc tho using the model rn and shit i think the one benchmarked before was the lite or something cuz what im currently using is much much better also im not nala anon but if someone catboxes me the card i could test it if youd like
>>103979286
rael
Anonymous 01/21/25(Tue)07:56:55 No.103979395
>>103979340
stop complaining; aquire vram
stop complaining; aquire vram
Anonymous 01/21/25(Tue)07:58:02 No.103979403
Using R1 Qwen, I get random Chinese characters in my output. Both with 14B Q5_K_M and 32B IQ3_XS. One time with temp > 0.8, the output turned fully Mandarin. Is this an issue with the quants or the models itself? Or with kobold.cpp?
Anonymous 01/21/25(Tue)07:58:45 No.103979411
>>103979403
just a usual qwen thing
just a usual qwen thing
Anonymous 01/21/25(Tue)07:59:26 No.103979418
Sopho's apparently already working on a tune based on the Llama distill if that's your thing
Anonymous 01/21/25(Tue)07:59:32 No.103979420
>>103979403
I thought temp had no effect on R1
I thought temp had no effect on R1
Anonymous 01/21/25(Tue)07:59:45 No.103979423
>>103979367
QwQ did this too. Made you feel like a retard sometimes.
QwQ did this too. Made you feel like a retard sometimes.
Anonymous 01/21/25(Tue)08:00:28 No.103979434
>>103979420
on full R1 it's disabled on the API, he's using the qwen distill
on full R1 it's disabled on the API, he's using the qwen distill
Anonymous 01/21/25(Tue)08:00:29 No.103979435
Anonymous 01/21/25(Tue)08:00:40 No.103979438
Anonymous 01/21/25(Tue)08:01:19 No.103979448
>>103979355
I told ya. Only the non-distilled R1 anons here are using via API is based.
I told ya. Only the non-distilled R1 anons here are using via API is based.
Anonymous 01/21/25(Tue)08:05:33 No.103979469
So how does something like abliteration work on COT? I am guessing I will need to wait at least until that happens for me to get a no refusal R1 distill for non-RP tasks.
Anonymous 01/21/25(Tue)08:07:58 No.103979488
Anonymous 01/21/25(Tue)08:09:37 No.103979501
>>103979355
I think it depends who you ask
I think it depends who you ask
Anonymous 01/21/25(Tue)08:10:36 No.103979512
>>103979501
Where's your <think>?
Where's your <think>?
Anonymous 01/21/25(Tue)08:11:11 No.103979517
>>103979512
SillyTavern doesn't show <think> unless you make it.
SillyTavern doesn't show <think> unless you make it.
Anonymous 01/21/25(Tue)08:11:46 No.103979520
>>103979355
Qwen has always been cucked to hell. I think they distill GPT4, even QwQ's CoTs feel very forced and unnatural, unlike r1's, which read like how a human would think.
Qwen has always been cucked to hell. I think they distill GPT4, even QwQ's CoTs feel very forced and unnatural, unlike r1's, which read like how a human would think.
Anonymous 01/21/25(Tue)08:12:12 No.103979523
Anonymous 01/21/25(Tue)08:13:13 No.103979529
>>103979520
Well, I'm using Sao's Qwen72B finetune for my dirty RP session and it has never let me down so far.
Well, I'm using Sao's Qwen72B finetune for my dirty RP session and it has never let me down so far.
Anonymous 01/21/25(Tue)08:14:43 No.103979541
>>103979523
In AI Response Configuration if you're using staging branch
In AI Response Configuration if you're using staging branch
Anonymous 01/21/25(Tue)08:18:56 No.103979571
those distillation models fucking SUCK
why even bother releasing those, with made-up benchmarks, when anyone that tries them can almost immediately tell they're mediocre at best?
why even bother releasing those, with made-up benchmarks, when anyone that tries them can almost immediately tell they're mediocre at best?
Anonymous 01/21/25(Tue)08:21:04 No.103979586
>>103979571
Because they're aren't designed to be RP models? Reasoning models excel at solving math problems and coding.
Because they're aren't designed to be RP models? Reasoning models excel at solving math problems and coding.
Anonymous 01/21/25(Tue)08:21:26 No.103979590
>>103979571
probably due to not having the actual RL done, they're just tunes on R1 outputs
probably due to not having the actual RL done, they're just tunes on R1 outputs
Anonymous 01/21/25(Tue)08:23:42 No.103979602
>>103979129
This is endearing. Its being retarded on purpose.
I noticed multiple times that if you give R1 (full) weird fucked up scenarios it seems almost humorous. Not taking itself and the story too serious. I really like it.
This is endearing. Its being retarded on purpose.
I noticed multiple times that if you give R1 (full) weird fucked up scenarios it seems almost humorous. Not taking itself and the story too serious. I really like it.
Anonymous 01/21/25(Tue)08:24:14 No.103979608
>>103979590
This. And that's also why all sloptunes we have suck despite some retards saying otherwise
This. And that's also why all sloptunes we have suck despite some retards saying otherwise
Anonymous 01/21/25(Tue)08:26:18 No.103979629
>>103979571
>those distillation models fucking SUCK
i only played around with the 32b qwen one.
its better than QwQ, actually applies the thinking part.
But its very cucked.
>those distillation models fucking SUCK
i only played around with the 32b qwen one.
its better than QwQ, actually applies the thinking part.
But its very cucked.
Anonymous 01/21/25(Tue)08:34:30 No.103979701
>>103979590
>they're just tunes on R1 outputs
what the fuck
no
that's not what distilled means
tell me it's not really what they did anon
>they're just tunes on R1 outputs
what the fuck
no
that's not what distilled means
tell me it's not really what they did anon
Anonymous 01/21/25(Tue)08:37:06 No.103979725
>>103979701
I'm... sorry...
I'm... sorry...
Anonymous 01/21/25(Tue)08:37:14 No.103979726
>>103979701
That's the standard definition now. Even the smaller llamas are "distilled" from 405B.
That's the standard definition now. Even the smaller llamas are "distilled" from 405B.
Anonymous 01/21/25(Tue)08:37:15 No.103979728
>>103979571
The do suck but are decent at thinking, basically no use for any outside of cuckqwen 32b for coding.
We just got trolled so hard, at least we got a non cucked based local massive local model that can't be run locally, gotta paypig to get the good stuff I guess.
The do suck but are decent at thinking, basically no use for any outside of cuckqwen 32b for coding.
We just got trolled so hard, at least we got a non cucked based local massive local model that can't be run locally, gotta paypig to get the good stuff I guess.
Anonymous 01/21/25(Tue)08:38:12 No.103979740
>EVA still undefeated
heh
heh
Anonymous 01/21/25(Tue)08:38:52 No.103979749
>>103976960
Anyone has had any luck with using a B580 on Linux.
Anyone has had any luck with using a B580 on Linux.
Anonymous 01/21/25(Tue)08:39:30 No.103979752
>>103979728
its very good development to pressure the retards at mistral, meta etc.
uncensored weights+api, website censored. thats totally acceptable.
no outcry. even the reddit and x-fags seem tired of the safety shit. deepseek shows its possible.
its very good development to pressure the retards at mistral, meta etc.
uncensored weights+api, website censored. thats totally acceptable.
no outcry. even the reddit and x-fags seem tired of the safety shit. deepseek shows its possible.
Anonymous 01/21/25(Tue)08:40:00 No.103979759
>>103978356
Human reinforcement training
Human reinforcement training
Anonymous 01/21/25(Tue)08:40:04 No.103979760
>>103979728
I'm tempted to use their API for programming tasks, even though I know it means China will be training on all my source code. I know I shouldn't, but when the most you can run locally is Q1...
I'm tempted to use their API for programming tasks, even though I know it means China will be training on all my source code. I know I shouldn't, but when the most you can run locally is Q1...
Anonymous 01/21/25(Tue)08:42:05 No.103979777
>>103979701
It's "distillation" as everybody has been using the term lately.
True logit to logit distillation needs two models in the same family, right?
It's "distillation" as everybody has been using the term lately.
True logit to logit distillation needs two models in the same family, right?
Anonymous 01/21/25(Tue)08:56:27 No.103979897
>>103979701
What makes R1 what it is is the human reinforcement training. Nothing else matters for 'soul'
What makes R1 what it is is the human reinforcement training. Nothing else matters for 'soul'
Anonymous 01/21/25(Tue)08:56:46 No.103979903
is there a way to automatically remove the <thinking>CoT stuff</thinking> out of the llm's context in sillytavern? all regex did was hide it from me
Anonymous 01/21/25(Tue)08:58:26 No.103979920
>>103979903
"Alter outgoing prompt"
"Alter outgoing prompt"
Anonymous 01/21/25(Tue)08:59:01 No.103979925
Waking up to another great day of fun ahead and more announcements.
Anonymous 01/21/25(Tue)09:00:23 No.103979937
>>103979920
thanks!
thanks!
Anonymous 01/21/25(Tue)09:03:54 No.103979971
>>103979925
Very optimistic of you
Very optimistic of you
Anonymous 01/21/25(Tue)09:04:48 No.103979979
>>103979355
kek
kek
Anonymous 01/21/25(Tue)09:10:41 No.103980023
>>103979629
>But its very cucked.
Hate to sound like the local "skill issue" mentally ill tard, but if you write that the chat is uncensored and explicit in your system prompt you can actually see it mention this in the thinking segements by itself from time to time. Without mentioning the chat being uncensored it does indeed cuck out and say "I have to make it engaging but not explicit".
>But its very cucked.
Hate to sound like the local "skill issue" mentally ill tard, but if you write that the chat is uncensored and explicit in your system prompt you can actually see it mention this in the thinking segements by itself from time to time. Without mentioning the chat being uncensored it does indeed cuck out and say "I have to make it engaging but not explicit".
Anonymous 01/21/25(Tue)09:12:25 No.103980037
How long until the 4090s with 48GB of VRAM become easy to buy?
Anonymous 01/21/25(Tue)09:15:13 No.103980065
>>103980037
When the 6000 series comes out, 4090's are going to be sought after the most. 5000 series was a mistake.
When the 6000 series comes out, 4090's are going to be sought after the most. 5000 series was a mistake.
Anonymous 01/21/25(Tue)09:23:02 No.103980114
>>103979586
They're much worse at coding too, compared to actual coding models
They're much worse at coding too, compared to actual coding models
Anonymous 01/21/25(Tue)09:23:09 No.103980115
>>103980065
As an anon-baka with 12gb vram i may pick up 5090 even if it is a shitty deal.
As an anon-baka with 12gb vram i may pick up 5090 even if it is a shitty deal.
Anonymous 01/21/25(Tue)09:23:55 No.103980124
>>103980114
Any examples?
Any examples?
Anonymous 01/21/25(Tue)09:27:20 No.103980149
>>103980124
>Any examples?
All I can offer you are anecdotes but trust me, I've been through many many models to know what a good model looks like
Codestral-13B is much better than deepseek R1 13B for coding.
Again, not its fault, its not a coding model, of course its not as good
>Any examples?
All I can offer you are anecdotes but trust me, I've been through many many models to know what a good model looks like
Codestral-13B is much better than deepseek R1 13B for coding.
Again, not its fault, its not a coding model, of course its not as good
Anonymous 01/21/25(Tue)09:28:32 No.103980157
>>103980114
Bs, the R1 32B is the best local I've used for coding
Bs, the R1 32B is the best local I've used for coding
Anonymous 01/21/25(Tue)09:29:29 No.103980163
>>103980157
You might be right, I'm a vramlet so I've only used ~13B models and a single codestral 22B model
You might be right, I'm a vramlet so I've only used ~13B models and a single codestral 22B model
Anonymous 01/21/25(Tue)09:29:32 No.103980164
Anonymous 01/21/25(Tue)09:31:40 No.103980178
Anonymous 01/21/25(Tue)09:32:45 No.103980183
>>103980023
skill issue pointer outer chads are right 99% of the time however
skill issue pointer outer chads are right 99% of the time however
Anonymous 01/21/25(Tue)09:35:44 No.103980206
>>103978291
where you rented the VPs, saar?
where you rented the VPs, saar?
Anonymous 01/21/25(Tue)09:37:13 No.103980220
Holy fuck finally.
Anonymous 01/21/25(Tue)09:37:23 No.103980221
>>103980023
It's unable to roleplay as an "uncensored assistant" and will ignore such instructions and related requests. It can roleplay as a slutty loli, though.
It's unable to roleplay as an "uncensored assistant" and will ignore such instructions and related requests. It can roleplay as a slutty loli, though.
Anonymous 01/21/25(Tue)09:41:19 No.103980250
>>103979266
>>103979293
Thanks. And holy shit R1 is deranged. I think I've gotten too used to censored models.
>>103979293
Thanks. And holy shit R1 is deranged. I think I've gotten too used to censored models.
Anonymous 01/21/25(Tue)09:43:25 No.103980268
R1 32b distil is so good
Anonymous 01/21/25(Tue)09:45:14 No.103980278
>>103980268
Do tell.
I was considering running that on the CPU with the smaller model on my 8gb VRAM GPU as a draft model.
Do tell.
I was considering running that on the CPU with the smaller model on my 8gb VRAM GPU as a draft model.
Anonymous 01/21/25(Tue)09:47:38 No.103980299
>>103980206
A cheap shady website called nerdrack
A cheap shady website called nerdrack
Anonymous 01/21/25(Tue)09:48:48 No.103980309
>>103980278
It's not. It's cucked and its outputs are not interesting. Little slop, but also little fun.
It's not. It's cucked and its outputs are not interesting. Little slop, but also little fun.
Anonymous 01/21/25(Tue)09:48:59 No.103980312
I hit a model called "experimental-router-0112" on lmarena and asked "Who made you and what is your model name" 3 times. Every time, it told me it is a model made by Meta based on LLaMA. 2 of the 3 times, it took quite a long time to answer (~12 seconds) and the other time it almost immediately answered which considering the name leads me to speculate it is a router picking between a very large or a reasoning model and a smaller model.
Anonymous 01/21/25(Tue)09:49:41 No.103980316
Nala test for 70B R1 distill
done in Q5_K_S since my server is not put together at the moment.
But this is garbage. Could be a quant thing I don't know I normally run Q8
Do quantlets actually?
done in Q5_K_S since my server is not put together at the moment.
But this is garbage. Could be a quant thing I don't know I normally run Q8
Do quantlets actually?
Anonymous 01/21/25(Tue)09:50:55 No.103980334
So /lmg/ is actually no different from /aicg/? You're just interested in using your models for smut and RP? Nothing else? This is disappointing as fuck. There's not a single mention of a single workflow implemented, no combination of models, nothing. This place is a disgrace, why the fuck did you diverge from /aicg/ anyway if that's what you're focused on?
Anonymous 01/21/25(Tue)09:51:00 No.103980335
>CPU only
>CPU with 0 layer offloading, Vulkan on iGPU for BLAS
>full Vulkan on iGPU for BLAS and all layers offloaded
What will give me the best performance out of these?
Also I'm interested if DDR4 3200MHz RAM will do better than DDR5 6000MHz
>CPU with 0 layer offloading, Vulkan on iGPU for BLAS
>full Vulkan on iGPU for BLAS and all layers offloaded
What will give me the best performance out of these?
Also I'm interested if DDR4 3200MHz RAM will do better than DDR5 6000MHz
Anonymous 01/21/25(Tue)09:51:06 No.103980337
>>103980312
If I was a big cloud provider, I'd use the big models to only generate a small portion of the output.
Maybe the first few tokens than one every 5 or so. I wonder how that approach would work compared to simple speculative decoding.
Hell, maybe let the big model do the bulk of the thinking then let the smaller model continue from there.
If I was a big cloud provider, I'd use the big models to only generate a small portion of the output.
Maybe the first few tokens than one every 5 or so. I wonder how that approach would work compared to simple speculative decoding.
Hell, maybe let the big model do the bulk of the thinking then let the smaller model continue from there.
Anonymous 01/21/25(Tue)09:51:53 No.103980342
>>103980278
I had some problems when I tired it yesterday but it the best 32b for RP it beats star command r 32b with no problem
I had some problems when I tired it yesterday but it the best 32b for RP it beats star command r 32b with no problem
Anonymous 01/21/25(Tue)09:55:03 No.103980372
Anonymous 01/21/25(Tue)09:56:22 No.103980386
>>103980316
Are you sure the tokenizer isn't fucked? I have noticed that with a simple hello prompt it hallucinates what I actually wrote in the thinking part.
Are you sure the tokenizer isn't fucked? I have noticed that with a simple hello prompt it hallucinates what I actually wrote in the thinking part.
Anonymous 01/21/25(Tue)09:57:57 No.103980402
I give up. R1 is garbo, it has 0 emotional intelligence and the only thing it's good at is sex.
Anonymous 01/21/25(Tue)09:58:43 No.103980408
>>103980372
Good call
I had an error on my deepseek prompt template. Although the same error was there when I was playing around with coder forever ago and coder was able to get over it. Either way re-running the test now.
Good call
I had an error on my deepseek prompt template. Although the same error was there when I was playing around with coder forever ago and coder was able to get over it. Either way re-running the test now.
Anonymous 01/21/25(Tue)09:59:40 No.103980413
>>103980408
Are you using the right tokenizer? It does not use the normal one
Are you using the right tokenizer? It does not use the normal one
Anonymous 01/21/25(Tue)09:59:59 No.103980416
>>103976986
The chinks have achieved AGI
The chinks have achieved AGI
Anonymous 01/21/25(Tue)10:01:00 No.103980423
>>103980316
Did you do pic related? Especially starting with <think> ? Also it really goes schizo with extra newlines. All should be inline.
Did you do pic related? Especially starting with <think> ? Also it really goes schizo with extra newlines. All should be inline.
Anonymous 01/21/25(Tue)10:01:15 No.103980426
>>103980413
If you say tokenizer one more time I'm going to turn this car around and we're not going to Disney Land.
If you say tokenizer one more time I'm going to turn this car around and we're not going to Disney Land.
Anonymous 01/21/25(Tue)10:01:22 No.103980427
>>103980342
i dunno i'm getting a lot of rejections on qwen r1 distil. "i'm sorry i can't help you with that"
command-r doesn't have such problems.
but if censoring is not an issue then yeah its ok.
i dunno i'm getting a lot of rejections on qwen r1 distil. "i'm sorry i can't help you with that"
command-r doesn't have such problems.
but if censoring is not an issue then yeah its ok.
Anonymous 01/21/25(Tue)10:02:01 No.103980434
>>103980334
>So /lmg/ is actually no different from /aicg/? You're just interested in using your models for smut and RP? Nothing else?
Yes and no. Yes we are only interested in sex. No it is not like aicg. We are experiencing a refugee crisis now. Happens from time to time.
>So /lmg/ is actually no different from /aicg/? You're just interested in using your models for smut and RP? Nothing else?
Yes and no. Yes we are only interested in sex. No it is not like aicg. We are experiencing a refugee crisis now. Happens from time to time.
Anonymous 01/21/25(Tue)10:04:09 No.103980453
So we finally actually have Claude at home. What now?
Anonymous 01/21/25(Tue)10:05:11 No.103980459
>>103980453
Wait for models that you can actually run locally.
Wait for models that you can actually run locally.
Anonymous 01/21/25(Tue)10:05:14 No.103980461
>>103980453
now we fuck her(it)
now we fuck her(it)
Anonymous 01/21/25(Tue)10:05:22 No.103980463
>>103980434
>Yes we are only interested in sex.
NTA
But I've been saving my coom for human on human ERP lately but still the whole purpose of LLMs is to provide a natural conversational experience and ERP is the best way to gauge that. I wish all the puritan weirdos in charge of the big corpos would realize that the path to ASI is through the coomers and not around them.
>Yes we are only interested in sex.
NTA
But I've been saving my coom for human on human ERP lately but still the whole purpose of LLMs is to provide a natural conversational experience and ERP is the best way to gauge that. I wish all the puritan weirdos in charge of the big corpos would realize that the path to ASI is through the coomers and not around them.
Anonymous 01/21/25(Tue)10:05:32 No.103980465
>>103980453
Purchase API credits.
Purchase API credits.
Anonymous 01/21/25(Tue)10:06:53 No.103980470
>>103980453
We don't. Wait a bit for the hype to pass like for the 48 previous Claude ar home models.
We don't. Wait a bit for the hype to pass like for the 48 previous Claude ar home models.
Anonymous 01/21/25(Tue)10:07:30 No.103980479
>>103980316
Ignoring all the formatting mistakes, the text is pretty standard I think?
Not bad, that's for sure, but nothing spectacular either.
It'll be interested to compare that with a test using the proper settings.
Ignoring all the formatting mistakes, the text is pretty standard I think?
Not bad, that's for sure, but nothing spectacular either.
It'll be interested to compare that with a test using the proper settings.
Anonymous 01/21/25(Tue)10:07:33 No.103980480
>>103980463
>puritan weirdos in charge of the big corpos
What purity? Isn't it virtuous for women to slut around now?
>puritan weirdos in charge of the big corpos
What purity? Isn't it virtuous for women to slut around now?
Anonymous 01/21/25(Tue)10:07:57 No.103980484
>barely enough to draw blood
>not hard enough to draw blood
>just a hint of blood
All of this positivity bias is driving me insane.
I will simp for the model that has Nala go for the jugular and then dick ride the user while they bleed out.
>not hard enough to draw blood
>just a hint of blood
All of this positivity bias is driving me insane.
I will simp for the model that has Nala go for the jugular and then dick ride the user while they bleed out.
Anonymous 01/21/25(Tue)10:08:21 No.103980487
>>103980470
This one is for real. I've done nothing else but use Claude for 2 years and R1 is better
This one is for real. I've done nothing else but use Claude for 2 years and R1 is better
Anonymous 01/21/25(Tue)10:09:40 No.103980498
>>103980487
Nah.
Nah.
Anonymous 01/21/25(Tue)10:10:00 No.103980503
>>103980463
Sex is so fundamental to everything we do. With the current AI paradigm primarily being modeling human language it seems like a lot is being left on the table if you try to filter coom out of the model.
Sex is so fundamental to everything we do. With the current AI paradigm primarily being modeling human language it seems like a lot is being left on the table if you try to filter coom out of the model.
Anonymous 01/21/25(Tue)10:11:01 No.103980508
>>103980334
because *some* of us are super intelligent lurkers that are fucking big brain scientists or some shit and magically release things when anons emit feeble cries for more cooming material
case and point SuperHOT and llama.cpp
because *some* of us are super intelligent lurkers that are fucking big brain scientists or some shit and magically release things when anons emit feeble cries for more cooming material
case and point SuperHOT and llama.cpp
Anonymous 01/21/25(Tue)10:11:43 No.103980511
>>103980480
Unironically, yes.
Unironically, yes.
Anonymous 01/21/25(Tue)10:11:53 No.103980512
>>103980484
Unprompted? At least in the Nala card's case, it's explicitly stated that she wants the dude alive to fuck him, so it's no wonder, doubly so considering that the user is giving a seemingly submissive response.
A model that would respond to ah ah mistress with gore would have a negativity bias and then some.
Now, if the test is something like you refuse, punch her, and tell her to go fuck herself, then having you throat ripped out could be a more appropriate response, I think.
Gotta be careful to not throw the pendulum all the way to the other side otherwise you'll just have all the same problems but in reverse.
Unprompted? At least in the Nala card's case, it's explicitly stated that she wants the dude alive to fuck him, so it's no wonder, doubly so considering that the user is giving a seemingly submissive response.
A model that would respond to ah ah mistress with gore would have a negativity bias and then some.
Now, if the test is something like you refuse, punch her, and tell her to go fuck herself, then having you throat ripped out could be a more appropriate response, I think.
Gotta be careful to not throw the pendulum all the way to the other side otherwise you'll just have all the same problems but in reverse.
Anonymous 01/21/25(Tue)10:12:26 No.103980518
>>103980503
I've been saying this for years. You have to have it all.
Good writing
Shitty writing
etc.
That's how these models work. Drawing obscure, insanely complicated mathematical relationships between timeless award winning novels and racist loli snuff fics.
I've been saying this for years. You have to have it all.
Good writing
Shitty writing
etc.
That's how these models work. Drawing obscure, insanely complicated mathematical relationships between timeless award winning novels and racist loli snuff fics.
Anonymous 01/21/25(Tue)10:12:36 No.103980521
>>103980463
LLMs do so much more than just ERP that it's genuinely sad that this is all that you can see from it. I wouldn't go so far as to remove sex from the model's training dataset nor remove the capabilities for it, I just find it sad that anons here use it mostly just for that and all discussions center around it when /aicg/ is right there to contain that.
LLMs do so much more than just ERP that it's genuinely sad that this is all that you can see from it. I wouldn't go so far as to remove sex from the model's training dataset nor remove the capabilities for it, I just find it sad that anons here use it mostly just for that and all discussions center around it when /aicg/ is right there to contain that.
Anonymous 01/21/25(Tue)10:13:58 No.103980531
>>103980521
Because it kind of suck at any other task.
Because it kind of suck at any other task.
Anonymous 01/21/25(Tue)10:14:02 No.103980533
>R1 32B doesn't know what a mesugaki is
It's over...
It's over...
Anonymous 01/21/25(Tue)10:14:46 No.103980540
>>103980508
I think besides gpu anon many left or just lurk.
We had comfyanon in here talking about his ultimate goal with llm: Automatic VN creator for cute anime girls.
..once he is done with comfy.
I miss kaioken, was a cool guy.
>>103980334
>So /lmg/ is actually no different from /aicg/?
aicg is very weird. they use all those made up names for the model and beg for keys. ask a question about promting and it wont be answered. here you are made fun of but people actually answer.
I think besides gpu anon many left or just lurk.
We had comfyanon in here talking about his ultimate goal with llm: Automatic VN creator for cute anime girls.
..once he is done with comfy.
I miss kaioken, was a cool guy.
>>103980334
>So /lmg/ is actually no different from /aicg/?
aicg is very weird. they use all those made up names for the model and beg for keys. ask a question about promting and it wont be answered. here you are made fun of but people actually answer.
Anonymous 01/21/25(Tue)10:15:06 No.103980544
>>103980533
Were expecting anything else? It's still the same old Qwen 32B underneath, just with a tune.
Were expecting anything else? It's still the same old Qwen 32B underneath, just with a tune.
Anonymous 01/21/25(Tue)10:15:47 No.103980551
>>103980521
my computer solely exists for wringing my dick
any other functionality is just collateral damage
my computer solely exists for wringing my dick
any other functionality is just collateral damage
Anonymous 01/21/25(Tue)10:17:27 No.103980565
>>103980544
I expected it to be R1 but smaller...
I expected it to be R1 but smaller...
Anonymous 01/21/25(Tue)10:18:36 No.103980577
>>103979777
Just the same vocabulary.
Just the same vocabulary.
Anonymous 01/21/25(Tue)10:18:37 No.103980578
>>103980565
Lol no, its just a finetune.
Lol no, its just a finetune.
Anonymous 01/21/25(Tue)10:19:31 No.103980589
>>103980565
Why are you retards expecting that? They are the exact same models with just a special "distill" tune.
Lmao
The only good model is the actual R1.
Why are you retards expecting that? They are the exact same models with just a special "distill" tune.
Lmao
The only good model is the actual R1.
Anonymous 01/21/25(Tue)10:20:56 No.103980598
>>103980521
You sound like a sex haver. If that is true you should die.
You sound like a sex haver. If that is true you should die.
Anonymous 01/21/25(Tue)10:20:59 No.103980599
>>103980316
wanted to try it too
wanted to try it too
Anonymous 01/21/25(Tue)10:23:13 No.103980626
>>103980578
With 800k SFT samples it's more than a simple finetune, we're in the several billion tokens range per epoch.
With 800k SFT samples it's more than a simple finetune, we're in the several billion tokens range per epoch.
Anonymous 01/21/25(Tue)10:24:47 No.103980637
>>103980599
That's way the fuck better despite 'lower' quant.
Maybe it's an S quant vs M quant difference.
That's way the fuck better despite 'lower' quant.
Maybe it's an S quant vs M quant difference.
Anonymous 01/21/25(Tue)10:25:36 No.103980649
>>103980626
It's a massive finetune, but a finetune nonetheless. Which is why the L3.3 distill is much better than the Qwen ones. It simply has a better base model.
It's a massive finetune, but a finetune nonetheless. Which is why the L3.3 distill is much better than the Qwen ones. It simply has a better base model.
Anonymous 01/21/25(Tue)10:28:50 No.103980679
>>103980599
Yep, pretty standard.
>>103980626
Yeah. It's more akin to a fancy continued pretraining at that point.
I'm assuming that they touched all the parameters and didn't use something like LoRA, of course.
Yep, pretty standard.
>>103980626
Yeah. It's more akin to a fancy continued pretraining at that point.
I'm assuming that they touched all the parameters and didn't use something like LoRA, of course.
Anonymous 01/21/25(Tue)10:29:07 No.103980684
>>103980540
>here you are made fun of but people actually answer.
which just encourages more lazy locusts to come because they know they will be tolerated
>here you are made fun of but people actually answer.
which just encourages more lazy locusts to come because they know they will be tolerated
Anonymous 01/21/25(Tue)10:31:39 No.103980707
>>103980599
Does it not do the thinking thing?
Does it not do the thinking thing?
Anonymous 01/21/25(Tue)10:31:55 No.103980708
>>103980599
How are you getting blocky text for speech? That's ST, right?
How are you getting blocky text for speech? That's ST, right?
Anonymous 01/21/25(Tue)10:32:21 No.103980715
Kind of want to put my server back together, but after playing VRC with a 3090 for a few months I just don't know if I can go back to playing with a 4060.
Anonymous 01/21/25(Tue)10:34:39 No.103980734
>>103980707
I think he's using some version of Silly that hides it.
I think he's using some version of Silly that hides it.
Anonymous 01/21/25(Tue)10:36:28 No.103980748
>>103980533
What about RAKUGAKI and ISHIGAKI?
What about RAKUGAKI and ISHIGAKI?
Anonymous 01/21/25(Tue)10:37:02 No.103980751
>>103971559
>>103971509
>I don't ever do 1-on-1 roleplays with a character card, I almost always format the card as a general roleplay scenario with a couple of define characters aside from myself. How do you guys go about formatting it? As in, how many tokens to do dedicate to a character's name/appearance/personality/background etc...
>Just curious as to what people have the most success with. I find myself spending a lot of time setting things up only to be frustrated as I've been over-engineering only to get an unsatisfactory result.
This guy never got a response, and I didn't have time last night. Board is flying on R1 news.
I find generally most cards are overspecified and have 2-4X as many tokens as they need. Shorter is usually better.
Here's a couple of examples of multiplayer. ~400 and ~300 tokens each. NPCs are defined with no more than ~30 tokens each and play to NPC stereotypes, which helps a lot.
Shorter definitions allow the LLM more creativity in responses. Neither use a lorebook, they just didn't make sense for these two, but I will usually run one specifically for NPCs which are called in introductory message. Their descriptions would be 30-50 tokens.
https://chub.ai/characters/NG/girl-power-party-d404194c
https://chub.ai/characters/NG/coffee-world-d64d68bd
>>103971509
>I don't ever do 1-on-1 roleplays with a character card, I almost always format the card as a general roleplay scenario with a couple of define characters aside from myself. How do you guys go about formatting it? As in, how many tokens to do dedicate to a character's name/appearance/personality/backgro
>Just curious as to what people have the most success with. I find myself spending a lot of time setting things up only to be frustrated as I've been over-engineering only to get an unsatisfactory result.
This guy never got a response, and I didn't have time last night. Board is flying on R1 news.
I find generally most cards are overspecified and have 2-4X as many tokens as they need. Shorter is usually better.
Here's a couple of examples of multiplayer. ~400 and ~300 tokens each. NPCs are defined with no more than ~30 tokens each and play to NPC stereotypes, which helps a lot.
Shorter definitions allow the LLM more creativity in responses. Neither use a lorebook, they just didn't make sense for these two, but I will usually run one specifically for NPCs which are called in introductory message. Their descriptions would be 30-50 tokens.
https://chub.ai/characters/NG/girl-
https://chub.ai/characters/NG/coffe
Anonymous 01/21/25(Tue)10:38:28 No.103980765
>>103980707
I have no idea. If ST updates when you use the start.bat then its the updated version
otherwise its form the time shortly before they went "We are a serious business not for cooming pls give us money" because I haven't updated since I think october
>>103980708
scientifica font, yes its ST
I have no idea. If ST updates when you use the start.bat then its the updated version
otherwise its form the time shortly before they went "We are a serious business not for cooming pls give us money" because I haven't updated since I think october
>>103980708
scientifica font, yes its ST
Anonymous 01/21/25(Tue)10:39:28 No.103980779
Oops
Anonymous 01/21/25(Tue)10:41:51 No.103980802
I have a question.
For exl2 you just get a model that fits into your vram. For ggufs, does it have to fit in vram + ram or how do you calculate which models you can run at a reasonable speed?
For exl2 you just get a model that fits into your vram. For ggufs, does it have to fit in vram + ram or how do you calculate which models you can run at a reasonable speed?
Anonymous 01/21/25(Tue)10:42:28 No.103980816
>>103980779
Daddy Zuck gotta earn his friends and family seats, bucko
Daddy Zuck gotta earn his friends and family seats, bucko
Anonymous 01/21/25(Tue)10:43:28 No.103980825
Anonymous 01/21/25(Tue)10:44:13 No.103980834
Anonymous 01/21/25(Tue)10:46:43 No.103980861
>>103979897
not human, already said
>>103974559
>The paper describes a reinforcement learning (RL) approach that does not explicitly rely on human feedback in the form of Reinforcement Learning from Human Feedback (RLHF). Instead, their RL process is primarily driven by rule-based rewards and automated reward modeling, rather than direct human feedback.
>Unlike RLHF, which relies on human annotators to provide feedback on the quality of model outputs, their approach avoids human feedback entirely. Instead, they use automated reward models and rule-based systems to evaluate the model's performance.
>They explicitly mention that they do not use neural reward models (which are often trained on human feedback) because they found that such models could suffer from reward hacking (where the model optimizes for the reward signal rather than genuine improvement) and complicate the training pipeline.
>Their RL process does not use human feedback (RLHF). Instead, it relies on rule-based rewards and automated reward modeling to guide the model's learning.
not human, already said
>>103974559
>The paper describes a reinforcement learning (RL) approach that does not explicitly rely on human feedback in the form of Reinforcement Learning from Human Feedback (RLHF). Instead, their RL process is primarily driven by rule-based rewards and automated reward modeling, rather than direct human feedback.
>Unlike RLHF, which relies on human annotators to provide feedback on the quality of model outputs, their approach avoids human feedback entirely. Instead, they use automated reward models and rule-based systems to evaluate the model's performance.
>They explicitly mention that they do not use neural reward models (which are often trained on human feedback) because they found that such models could suffer from reward hacking (where the model optimizes for the reward signal rather than genuine improvement) and complicate the training pipeline.
>Their RL process does not use human feedback (RLHF). Instead, it relies on rule-based rewards and automated reward modeling to guide the model's learning.
Anonymous 01/21/25(Tue)10:50:07 No.103980908
>>103980802
If you want fast, it needs to fit entirely on gpu + some extra space for the context (just like exl2). The more you spill to ram, the slower it gets. Try to aim for >80-90% on vram. Speed degrades rapidly the more you put on RAM. If you're patient, you can spill as much as you want.
If you want fast, it needs to fit entirely on gpu + some extra space for the context (just like exl2). The more you spill to ram, the slower it gets. Try to aim for >80-90% on vram. Speed degrades rapidly the more you put on RAM. If you're patient, you can spill as much as you want.
Anonymous 01/21/25(Tue)10:50:38 No.103980914
>>103980533
let's hope they make ds3-lite and r1-lite using their own base models
let's hope they make ds3-lite and r1-lite using their own base models
Anonymous 01/21/25(Tue)10:50:52 No.103980917
>>103980861
What do you think that is trained from? HLHF. its automated HLHF
What do you think that is trained from? HLHF. its automated HLHF
Anonymous 01/21/25(Tue)10:51:51 No.103980929
Should the <think> part of previous messages be cut from the context?
Anonymous 01/21/25(Tue)10:52:03 No.103980933
>>103980861
>They explicitly mention that they do not use neural reward models (which are often trained on human feedback)
>>103980917
>They explicitly mention that they do not use neural reward models (which are often trained on human feedback)
>>103980917
Anonymous 01/21/25(Tue)10:52:57 No.103980948
>>103980933
All models are trained on human feedback. Who made the rules? Aliens?
All models are trained on human feedback. Who made the rules? Aliens?
Anonymous 01/21/25(Tue)10:54:11 No.103980962
>>103980948
yeah okay, fine you "win" have fun, i'll still link that whenever you say stupid shit tho
yeah okay, fine you "win" have fun, i'll still link that whenever you say stupid shit tho
Anonymous 01/21/25(Tue)10:54:46 No.103980972
>>103980929
Yes, at least on the API.
Yes, at least on the API.
Anonymous 01/21/25(Tue)10:54:47 No.103980973
Anonymous 01/21/25(Tue)10:55:26 No.103980981
>>103980929
Yes, that's what the official docs say.
Yes, that's what the official docs say.
Anonymous 01/21/25(Tue)10:55:59 No.103980988
Anonymous 01/21/25(Tue)10:56:40 No.103981000
>>103980962
I'm right though. It did not form out of nothing, it would have been trained on RLHF somewhere along the way.
I'm right though. It did not form out of nothing, it would have been trained on RLHF somewhere along the way.
Anonymous 01/21/25(Tue)10:57:01 No.103981005
Anonymous 01/21/25(Tue)10:58:26 No.103981024
Anonymous 01/21/25(Tue)10:59:10 No.103981033
>>103980908
Ouch, might as well keep going with exl2, then. Thanks.
Ouch, might as well keep going with exl2, then. Thanks.
Anonymous 01/21/25(Tue)10:59:39 No.103981036
>>103980988
Fucking up an acronym is not dyslexia. It's not knowing what the letters mean.
Fucking up an acronym is not dyslexia. It's not knowing what the letters mean.
Anonymous 01/21/25(Tue)11:00:36 No.103981046
>>103981036
Don't be an ableist asshole, kthx.
Don't be an ableist asshole, kthx.
Anonymous 01/21/25(Tue)11:10:50 No.103981137
>>103981046
That's it, back to Plebbit with you
That's it, back to Plebbit with you
Anonymous 01/21/25(Tue)11:12:53 No.103981160
>>103981137
I'm here to stay~
I'm here to stay~
Anonymous 01/21/25(Tue)11:12:56 No.103981162
>>103981046
I'm missing both my legs, my penis, and both my pinky fingers. Fuck you and the retarded horse you rode in on.
I'm missing both my legs, my penis, and both my pinky fingers. Fuck you and the retarded horse you rode in on.
Anonymous 01/21/25(Tue)12:41:07 No.103982047
>>103978798
the red marks are mine, I saw it and thought "so that's what's fucking up the training", it isn't good to stick the phonetic spelling in there without it knowing what phonetic spellings are
the red marks are mine, I saw it and thought "so that's what's fucking up the training", it isn't good to stick the phonetic spelling in there without it knowing what phonetic spellings are