/lmg/ - Local Models General
Anonymous 01/14/25(Tue)04:26:03 | 450 comments | 44 images | 🔒 Locked
/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>103881688 & >>103871751
►News
>(01/14) MiniCPM-o 2.6 released with multi-image and video understanding, realtime speech conversation, voice cloning, and multimodal live streaming: https://hf.co/openbmb/MiniCPM-o-2_6
>(01/08) Phi-4 weights released: https://hf.co/microsoft/phi-4
>(01/06) NVIDIA Project DIGITS announced, capable of running 200B models: https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwell-on-every-desk-and-at-every-ai-developers-fingertips
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous 01/14/25(Tue)04:26:33 No.103888594
►Recent Highlights from the Previous Thread: >>103881688
--Paper: SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training:
>103886784 >103886926
--Paper: Transformer^2: Self-adaptive LLMs:
>103886931 >103887964
--Papers:
>103886689 >103886794 >103887023
--Test-time compute storytelling and the role of model size in creative writing:
>103884801 >103884814 >103884855 >103884873 >103884914 >103885026 >103885107
--Relationship between model size and quantization sensitivity discussed:
>103881791 >103881850 >103881898 >103881964 >103882008 >103882183
--Discussion of DIGITS and PC building options, with a focus on memory bandwidth and performance:
>103883782 >103883785 >103883880 >103883904 >103883934 >103883963 >103883986 >103884107 >103884142 >103884150 >103884596 >103883999 >103884009 >103884136 >103883825
--Discussion of Mac and DIGITS systems, memory, and GPU capabilities:
>103882515 >103882577 >103882642 >103882740 >103882781 >103882884 >103883056 >103883116 >103883142 >103883315
--Discussion of AI models, GPU performance, and optimization strategies:
>103884724 >103884770 >103885024 >103886084 >103886381 >103886954 >103886979 >103886993 >103887024 >103887252 >103887019 >103887037 >103887132
--Speculation about Nvidia's mysterious repository on Hugging Face:
>103884597 >103884660 >103884687 >103884712 >103885202 >103884910
--FP8 vs Q8: data types, precision, and information loss:
>103883157 >103883204 >103883239 >103883245 >103883249
--Phi vs Llama for finetuning discussion:
>103884786 >103884861 >103884972 >103885004 >103885793
--Anon shares anonymous-chatbot response explaining lolilibaba concept:
>103883568 >103883604 >103884750
--UGI Leaderboard evaluates language models' ideological leaning and neutrality:
>103883290 >103883625
--Miku (free space):
>103883015 >103884150 >103884327 >103884720 >103886919 >103887221
►Recent Highlight Posts from the Previous Thread: >>103881693
Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous 01/14/25(Tue)04:29:48 No.103888618
>Let's try this "MiniCPM"
>1 hour later
>Still building flash attention wheel
Yes I installed ninja
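(for anyone else stuck on this: you can usually dodge the multi-hour compile by grabbing a prebuilt wheel from the flash-attn GitHub releases that matches your torch/CUDA/python combo, or at least cap the parallel jobs so the build doesn't thrash, e.g.
MAX_JOBS=4 pip install flash-attn --no-build-isolation
treat the exact flags as a sketch and check the flash-attn README for your setup)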
Anonymous 01/14/25(Tue)04:36:39 No.103888658
Tetolove
Anonymous 01/14/25(Tue)04:47:24 No.103888709
>>103888594
I love you, Recap Teto.
Anonymous 01/14/25(Tue)05:05:15 No.103888840
are any of these good enough to use with a chat frontend at reasonable speed? I have a 3090, I'm guessing most of you run something similar? I last tried llama3 and it was impressive for running locally and free, but fairly bad compared to chatgpt and other online models
Anonymous 01/14/25(Tue)05:12:03 No.103888885
>>103888840
Local models have usually caught up to SaaS in at least one area, but there aren't really any models that excel in all areas like a lot of the corpo models do. You gotta pick and choose a model for your niche. Llama 3 is kind of the exception there in that it's aggressively mediocre at everything
Anonymous 01/14/25(Tue)05:12:04 No.103888886
what would you use if you had two 3060s (12 GB each) and 32 GB VRAM? for either RP or other things
>inb4 jokes about how the rig is shit, I guess
Anonymous 01/14/25(Tue)05:13:26 No.103888893
>>103888886
A Qwen 32b based model at Q5 or something probably. It's not the worst ending.
Anonymous 01/14/25(Tue)05:16:12 No.103888919
>>103888886
>and 32 GB VRAM
Nice 5090 + 2x 3060 setup.
On a more serious note, I'd go with Cydonia first. See if you like it.
Anonymous 01/14/25(Tue)05:16:55 No.103888923
>>103888885
damn, almost neato digits & thanks for the info, I'll investigate more tomorrow. all the gooners in aicg dying over jailbreaks and proxies when lmg might be the answer
Anonymous 01/14/25(Tue)05:55:21 No.103889140
Are there models that can translate? As in, are those specialized, or can any model do it? I'm not really finding anything atm (found something from 2 years ago though)
Anonymous 01/14/25(Tue)06:09:44 No.103889230
bitnet millions of experts 70b when?
Anonymous 01/14/25(Tue)06:11:22 No.103889242
>>103889221
By being attractive.
Anonymous 01/14/25(Tue)06:12:50 No.103889251
Anonymous 01/14/25(Tue)06:22:17 No.103889317
>>103889251
*audibly pops the magic bubble*
Anonymous 01/14/25(Tue)06:24:04 No.103889333
>>103889317
GLGLLUGLLGLRHH
Anonymous 01/14/25(Tue)06:31:51 No.103889378
>>103889333
Noooo :(
Anonymous 01/14/25(Tue)06:40:41 No.103889448
>>103888886
Do you mean 32GB RAM? Or 56GB VRAM total?
Anonymous 01/14/25(Tue)06:46:59 No.103889484
>>103888893
examples?
Anonymous 01/14/25(Tue)06:47:46 No.103889489
>>103889140
Take a look at some popular models.
See what languages they have been trained on.
They should be able to pretty much translate text between those languages.
Anonymous 01/14/25(Tue)07:11:02 No.103889655
>>103888893
>>103888919
Thank you. I'll try them both out.
>>103889448
Yeah, I have 32 GB of additional VRAM that was soldered on by some dude in Shenzhen while I had noodles
Anonymous 01/14/25(Tue)07:13:05 No.103889676
>https://huggingface.co/openbmb/MiniCPM-o-2_6/tree/main
I hate the demo because I get nervous talking with female voices
Anonymous 01/14/25(Tue)07:18:29 No.103889710
https://web.archive.org/web/20250114121236/https://www.theregister.com/2025/01/09/us_weighing_global_limits_ai_exports/
>"Along with compute caps on tier-2 nations, the rules may also include limits on the export of closed AI model weights. Model weights represent the numerical values that dictate how modern AI models function. Under the proposed rules, the Commerce Department aims to prevent companies from hosting closed model weights in tier-3 countries like China and Russia. Such a move would prevent major closed-source models from being served from these nations. Open models, like Meta's Llama 3.1 405B, would not be subject to these rules, nor would any closed model deemed less sophisticated than an existing open model."
>"nor would any closed model deemed less sophisticated than an existing open model."
What are they trying to do here?
>"Along with compute caps on tier-2 nations, the rules may also include limits on the export of closed AI model weights. Model weights represent the numerical values that dictate how modern AI models function. Under the proposed rules, the Commerce Department aims to prevent companies from hosting closed model weights in tier-3 countries like China and Russia. Such a move would prevent major closed-source models from being served from these nations. Open models, like Meta's Llama 3.1 405B, would not be subject to these rules, nor would any closed model deemed less sophisticated than an existing open model."
>"nor would any closed model deemed less sophisticated than an existing open model."
What are they trying to do here?
Anonymous 01/14/25(Tue)07:28:33 No.103889779
>>103889710
My guess is preventing potentially hostile countries from having privileged access to powerful AI models.
Anonymous 01/14/25(Tue)07:41:35 No.103889859
Titanpill me RIGHT NOW
https://arxiv.org/pdf/2501.00663v1
Anonymous 01/14/25(Tue)07:43:24 No.103889870
Anonymous 01/14/25(Tue)08:00:13 No.103889960
>>103888589
https://youtu.be/OSKgz8NfUoI
Anonymous 01/14/25(Tue)08:01:41 No.103889965
>>103889859
I'm a retard when it comes to math formulas, but I at least understand the terminology, and if I get it right, the basic idea is:
Models we use these days function best as an equivalent of short-term memory (which is why they get dumber with longer contexts). This architecture involves basically having a meta-model with additional context: a long-term memory that evaluates how "surprising" or "memorable" something is in its context, and feeds that data to the core model that operates on a smaller context (they keep using the phrase "learning to memorize at test time", but I see nothing to suggest any moving parts, so to speak). In other words, important details should be preserved from a much larger context (they claim it can reach 2M context), influencing the short-term memory, and in turn influencing the output.
It seems theoretically solid. Long-term memory is one of the aspects of LLMs that badly need a breakthrough, and this seems like a viable approach without requiring dynamic data on the user side.
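To make the gating intuition concrete, here's a throwaway numpy sketch of "write harder when surprised, decay otherwise". To be clear, this is my own toy, not the paper's actual equations (Titans uses a learned neural memory updated by gradient descent at test time), so treat every name and number below as made up.
import numpy as np

rng = np.random.default_rng(0)
d = 8                      # toy embedding size
memory = np.zeros(d)       # single long-term memory slot
decay, lr = 0.99, 0.5      # forgetting rate, write strength

def step(token_vec, memory):
    # "surprise" = how poorly the current memory explains this token
    surprise = np.linalg.norm(token_vec - memory)
    gate = surprise / (1.0 + surprise)     # squash to (0, 1)
    # surprising tokens get written into memory harder; boring ones mostly decay
    new_memory = decay * memory + lr * gate * token_vec
    return new_memory, surprise

for t in range(5):
    tok = rng.normal(size=d)               # stand-in for a token embedding
    memory, s = step(tok, memory)
    print(f"t={t} surprise={s:.2f}")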
Anonymous 01/14/25(Tue)08:06:50 No.103889997
Are non-dense models anti-local because they tend to require corporate-tier amounts of VRAM and substantially less compute, making them optimal for SaaS deployments?
Anonymous 01/14/25(Tue)08:09:08 No.103890016
Anonymous 01/14/25(Tue)08:09:09 No.103890018
>>103889859
Sorry, you said pill, not explain.
Well, if the results can be trusted, this basically cracks the problem of attention dilution over long contexts wide open. Finds the needle in the haystack near-perfectly. You know that important detail you mentioned exactly once in your RP some 10k tokens ago, that gradually got washed out until it was completely forgotten? This solves that problem.
Anonymous 01/14/25(Tue)08:09:23 No.103890019
>>103889960
cftf?
Anonymous 01/14/25(Tue)08:10:37 No.103890023
>>103889997
Not necessarily, Mixtral used to be a VRAMlet friendly model because as long as it can fit in RAM and the active parameters aren't too many, it can run at decent speeds
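Rough napkin math for why that works (numbers are ballpark, not measured): Mixtral 8x7B is ~47B params total but only ~13B are active per token. At ~4.5 bits per weight (Q4_K_M-ish) that's ~26 GB of weights sitting in RAM, but each token only has to stream ~7 GB of them. Dual-channel DDR5 at ~60 GB/s then caps you around 8 t/s, whereas a dense 47B streaming all 26 GB per token would cap out near 2 t/s. Real numbers come in lower, but the ratio is the point.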
Anonymous 01/14/25(Tue)08:13:03 No.103890038
>>103890018
tl;dr loredumpfags rejoice?
Anonymous 01/14/25(Tue)08:13:52 No.103890043
Anonymous 01/14/25(Tue)08:18:57 No.103890079
>>103889997
There are ratios of total size to expert size where it is actually better for local, since you can use your regular RAM to get more parameters at a small speed cost.
Anonymous 01/14/25(Tue)08:19:40 No.103890085
I tried getting the local demo for MiniCPM-o 2.6 working but there were too many problems, one after the other, to make it work well.
Seems solid otherwise. Might make a fun bantz buddy when gaming or something.
Anonymous 01/14/25(Tue)08:20:46 No.103890093
>>103889859
just skimmed it. feels like more than a meme paper this time. (he said)
Anonymous 01/14/25(Tue)08:21:19 No.103890096
>>103890019
go back
Anonymous 01/14/25(Tue)08:21:37 No.103890100
Anonymous 01/14/25(Tue)08:22:10 No.103890105
Anonymous 01/14/25(Tue)08:27:22 No.103890151
I want a model or finetune that understands memes and culture.
Anonymous 01/14/25(Tue)08:29:00 No.103890163
>>103890151
Claude. Heard it can even talk like a zoomer if prompted.
Anonymous 01/14/25(Tue)08:33:00 No.103890199
gemini didn't like my strategy for tsunamis
Anonymous 01/14/25(Tue)08:37:00 No.103890220
>>103890199
>missing the meme and taking it completely literally, at face value instead
At last, artificial autism.
Anonymous 01/14/25(Tue)08:47:54 No.103890295
They should make an RP leaderboard of sorts entirely based on the Nala test. It is just such a good test for so many reasons and would filter out shite
Anonymous 01/14/25(Tue)08:51:38 No.103890322
>>103890295
It's actually a completely retarded meme though.
Anonymous 01/14/25(Tue)08:53:08 No.103890329
>>103890295
I agree.
You evaluate the responses based on a couple of categories such as anatomical understanding, character adherence, etc.
Do 5 swipes for each model and take the average of the best 3 or something.
Anonymous 01/14/25(Tue)09:12:31 No.103890472
>>103888589
cute teto
Anonymous 01/14/25(Tue)09:12:48 No.103890474
>>103890163
A local model.
Anonymous 01/14/25(Tue)09:17:50 No.103890518
>>103890474
DeepSeekV3
Anonymous 01/14/25(Tue)09:24:21 No.103890566
llamiku 4 when
Anonymous 01/14/25(Tue)09:30:23 No.103890626
>>103888589
>MiniCPM-o
Has anyone got this working locally with its functionality intact? I couldn't get it to work.
Anonymous 01/14/25(Tue)09:46:14 No.103890770
I'm going to miss lmg-anon...
Anonymous 01/14/25(Tue)09:50:56 No.103890815
>>103890770
He'll be back in 3-15 years, no worries. Just in time for llama4.3-70b
Anonymous 01/14/25(Tue)09:55:55 No.103890872
>>103890770
So what did he do exactly? Say a woman should suck his dick cause his PC runs pytorch and a woman actually did it?
Anonymous 01/14/25(Tue)10:00:10 No.103890911
>>103890770
Which tag do I use for the middle one's tit shape?
Anonymous 01/14/25(Tue)10:01:05 No.103890923
Has anyone tried Sky-T1-32B-Preview?
Allegedly it is like qwq but less buggy and better at programming.
Anonymous 01/14/25(Tue)10:01:43 No.103890930
Anonymous 01/14/25(Tue)10:08:08 No.103891015
>>103890911
bestiality
Anonymous 01/14/25(Tue)10:09:02 No.103891025
>>103891015
I don't get the joke.
Anonymous 01/14/25(Tue)10:09:38 No.103891030
>>103890923
it's a model fine-tuned on qwq outputs, i doubt it's any better than it
Anonymous 01/14/25(Tue)10:16:13 No.103891090
>>103891030
It should be better but not much better.
Anonymous 01/14/25(Tue)10:19:46 No.103891132
I doubt that this is even lmg-anon, but that statement is not true, right? That sounds crazy.
Anonymous 01/14/25(Tue)10:19:59 No.103891137
Updated Silly Tavern on a single board computer to include maintenance items, and to get it to monitor and reconnect to wifi if dropped.
https://rentry.org/SillyTavernOnSBC
Anonymous 01/14/25(Tue)10:20:56 No.103891147
>>103891132
Wrong pic, meant to post this.
Anonymous 01/14/25(Tue)10:22:41 No.103891170
Anonymous 01/14/25(Tue)10:26:28 No.103891200
Anonymous 01/14/25(Tue)10:30:15 No.103891235
>>103891147
WHO
Anonymous 01/14/25(Tue)10:30:38 No.103891241
>>103891147
Since you can get imprisoned for a few insults on LoL, I'm sure that's true.
Anonymous 01/14/25(Tue)10:30:40 No.103891242
>>103891235
some locust enabler
Anonymous 01/14/25(Tue)10:32:49 No.103891264
>>103888589
Hi bros
Im a bit out of the loop on the memesamplers, can someone spoonfeed me a good value of smooth/dry or xtc for largestral?
normally i dont mess with them, but i remember smooth being kinda nice, and after doing a multicharacter card and the model giving VASTLY more attention to a character with a common name, i think i need to use them
Anonymous 01/14/25(Tue)10:33:46 No.103891273
Anonymous 01/14/25(Tue)10:38:55 No.103891318
Anonymous 01/14/25(Tue)10:40:16 No.103891329
>>103891147
The entirety of South Korea will disappear in 3 generations. He'll get the last laugh, might even be alive for it.
Anonymous 01/14/25(Tue)10:40:27 No.103891331
>>103890220
No. Catching the meme is the autism in this scenario, anon. Imagine going up to some rando math professor and being like
>"ehehe... gotta go fast, bet"
He'll think you're retarded. Same with gemini. It's just not allowed to say it.
Anonymous 01/14/25(Tue)10:40:49 No.103891333
Anonymous 01/14/25(Tue)10:41:09 No.103891341
>>103891273
Why is it so weird? They probably were accomplices to some degree for whatever he did.
Anonymous 01/14/25(Tue)10:42:07 No.103891353
Kill yourself.
Anonymous 01/14/25(Tue)10:43:20 No.103891367
>>103891333
is that kanken ni-kyu? Impressive.
Anonymous 01/14/25(Tue)10:48:28 No.103891423
>>103891333
im jealous
Anonymous 01/14/25(Tue)10:49:31 No.103891433
t-thanks grok. cant wait to have that power locally soon.
Anonymous 01/14/25(Tue)10:52:58 No.103891478
>In South Korea, defamation laws are particularly stringent, as evidenced by the information provided in the related web results.
>Defamation can lead to criminal charges with potential imprisonment up to three years if the information is true, and up to seven years if it is false.
>This is highlighted in the context of South Korean cyber defamation law, where even true information that harms a person's reputation can be punishable.
>This legal framework is different from many Western countries where defamation typically results in civil rather than criminal liability.
>Specific Allegations: According to Air Katakana's subsequent posts, the person who reported him to the police is a professor at KAIST named ******* Lee.
>Air Katakana alleges that this professor forced him to send a large amount of money to his wife under the threat of revoking a job offer that had been promised over a year prior.
>This suggests that the defamation might be tied to these financial and employment-related disputes.
>imprisonment up to three years if the information is true
I seriously hope that's just hallucinated up. 3 years for posting true stuff somebody doesn't like. wow
Anonymous 01/14/25(Tue)10:59:10 No.103891569
>>103891478
This is something air katakana wrote in one of his tweets, I think it was irony and the AI took it as a fact.
Anonymous 01/14/25(Tue)11:00:45 No.103891593
>>103891569
It's on wikipedia
>>103891478
I'd assume that's to discourage gossip and feuds about petty stuff, but it's ridiculously phrased.
Anonymous 01/14/25(Tue)11:03:07 No.103891619
Anonymous 01/14/25(Tue)11:07:23 No.103891662
>>103890626
the vision model works ok for me
Anonymous 01/14/25(Tue)11:08:04 No.103891668
Anonymous 01/14/25(Tue)11:11:23 No.103891698
>>103890518
A local model I can run. I'm not a poorfag either. I have 64gb ddr5.
Anonymous 01/14/25(Tue)11:13:03 No.103891721
>>103891698
"understand memes and culture" is too broad. what do you want to do exactly?
"understand memes and culture" is too broad. what do you want to do exactly?
Anonymous 01/14/25(Tue)11:15:47 No.103891754
>>103891478
You have to understand that all of Korea was nothing but illiterate farmers just a couple of generations ago. Unlike the Japanese and, at least the historically urban parts of, the Chinese, they don't really have a tradition of a stable and functioning modern civilization.
Anonymous 01/14/25(Tue)11:20:22 No.103891802
>>103891721
If you have to ask, your judgement won't be useful to me. I can tell based on the tone of your post that you're a pedantic shithead.
Anonymous 01/14/25(Tue)11:24:28 No.103891840
>>103891802
nta, but jesus fuck, anon...
>being pedantic about memes and "culture"
>I'm not a poorfag either.
>I have 64gb ddr5.
>pedantic shithead.
Anonymous 01/14/25(Tue)11:25:05 No.103891846
>>103891132
>I doubt that this is even lmg-anon
he seems like a pretty cool guy. Does he actually hang out here? We could swap JLPT stories
Anonymous 01/14/25(Tue)11:26:30 No.103891859
Anonymous 01/14/25(Tue)11:27:17 No.103891869
>>103891668
You're right.
Anonymous 01/14/25(Tue)11:31:31 No.103891904
How exactly does one start with this? I'm on linux.
The rentry tutorial says to download oobabooga, and their github says to run start_linux.sh.
Then I went to https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/
But I couldn't find a download link to any project. What comes after this?
Anonymous 01/14/25(Tue)11:34:32 No.103891932
>>103891904
By waiting 2 more weeks for better models.
Anonymous 01/14/25(Tue)11:34:34 No.103891933
>>103891904
copy-paste the name of the repo into the model download box in ooba. after that, load the model and done
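if you'd rather skip the built-in downloader, the huggingface-cli route also works; something like this (the repo name is just an example, check it exists and pick whatever quant fits your VRAM):
huggingface-cli download bartowski/Mistral-Nemo-Instruct-2407-GGUF --include "*Q4_K_M*" --local-dir models
then point ooba (or whatever backend) at the downloaded .gguf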
Anonymous 01/14/25(Tue)11:34:47 No.103891935
>>103891802
are you a toddler that expects to be spoonfed with minimal communication (crying)?
Anonymous 01/14/25(Tue)11:44:44 No.103892037
>>103891904
>memeboard leaders these days are 78(!)B frankenmerges of Qwen-72B
I hadn't looked at that cesspool in a year. Good to know it hasn't changed since then after it got flooded by chinks and indians training meme models on benchmark data.
Anonymous 01/14/25(Tue)11:49:07 No.103892080
>>103892037
we need better benchmarks
Anonymous 01/14/25(Tue)12:04:55 No.103892241
how are speeds benchmarked anyway?
T/s just refers to output speed, right?
Is there a factor x by which processing input is faster than generating tokens, or are these two unrelated? Also speeds seem to differ quite a bit by not only size but also model and datatype.
Anonymous 01/14/25(Tue)12:06:30 No.103892252
>>103892241
Most backends show you processing, generation and the total time each in t/s.
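Worked example with made-up but realistic numbers: say prompt processing runs at 500 t/s and generation at 20 t/s. A 2000-token prompt then takes ~4 s to ingest and a 300-token reply takes ~15 s to generate, ~19 s total, even though the headline "output speed" only describes the second part. Prompt processing is batched and compute-bound, generation is one-token-at-a-time and memory-bandwidth-bound, which is why the two numbers differ so much and why both shift with model size, quant and backend.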
Anonymous 01/14/25(Tue)12:14:51 No.103892328
>>103892037
>>103892080
open-llm-leaderboard evaluates on non-CoT. Meaning, they don't let the model generate a full solution, and then search through it and extract the answer, but rather directly check the probability of the answer in the first few tokens. That's why qwen2.5-72b scores higher than qwen2.5-72b-instruct, even though instruct is a much better assistant (which this benchmark is trying to evaluate).
Someone correct me if I'm wrong.
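For what it's worth, the conceptual difference looks roughly like this; the two scoring functions below are fakes, not a real eval-harness API, just an illustration of loglikelihood-style vs generative scoring:
import re, math, random

def fake_logprob(prompt, continuation):
    # stand-in for summing token logprobs of `continuation` given `prompt`
    random.seed(hash((prompt, continuation)) % 2**32)
    return math.log(random.uniform(0.01, 1.0))

def fake_generate(prompt):
    # stand-in for letting the model write a full CoT solution
    return "Let me think step by step... so the answer is (B)."

question = "2+2=? (A) 3 (B) 4 (C) 5"
choices = ["(A)", "(B)", "(C)"]

# non-CoT / loglikelihood scoring: just compare P(choice | question)
ll_pick = max(choices, key=lambda c: fake_logprob(question, c))

# generative scoring: generate a solution, then parse the answer out of it
m = re.search(r"answer is (\([A-C]\))", fake_generate(question))
gen_pick = m.group(1) if m else None

print("loglikelihood pick:", ll_pick, "| generative pick:", gen_pick)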
Anonymous 01/14/25(Tue)12:30:04 No.103892479
I'm following the "getting started" guide. It's telling me to "download nemo 12b instruct gguf". Where do I find this?
Anonymous 01/14/25(Tue)12:31:28 No.103892504
>>103892479
Nigga, this isn't spoonfeeding at this point, it's giving you knowledge in a fucking IV line.
Anonymous 01/14/25(Tue)12:33:46 No.103892534
>>103886370
>deepseek repeats too much
using --chat-template deepseek3 with a recent llama.cpp has eliminated any repeating for me. I don't think I've seen it once.
The only other variable is that I self-quant, so I can't speak for any online ggufs if those are the core issue
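For anyone wanting to copy it, that's just the server flag, something like this (model path and context size are placeholders):
./llama-server -m DeepSeek-V3-Q6_K.gguf --chat-template deepseek3 -c 8192
iirc there's also a --chat-template-file option if you want to supply your own template instead of a built-in name, but check llama-server --help for your build.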
Anonymous 01/14/25(Tue)12:33:48 No.103892535
>>103892479
>the treasure map says X marks the spot, where do I find this so I can start digging?
>[Attached picture: big X on the ground]
Anonymous 01/14/25(Tue)12:35:08 No.103892551
>>103892479
On the off chance you aren't trolling, have you tried writing "nemo 12b instruct gguf" in the search bar that's clearly visible in your image?
If not, try that.
You'll see a bunch of results, probably, download the bartowski one that has mistral in the name.
Anonymous 01/14/25(Tue)12:36:57 No.103892569
>>103892551
Doesn't "Vikhr" indicate that it's Russian?
Doesn't "Vikhr" indicate that it's Russian?
Anonymous 01/14/25(Tue)12:37:10 No.103892574
>>103892479
Use this https://ollama.com/
Anonymous 01/14/25(Tue)12:38:48 No.103892598
>>103892574
This is like giving someone a crackpipe.
Anonymous 01/14/25(Tue)12:38:55 No.103892599
>>103892569
The one with mistral in the name anon.
Anonymous 01/14/25(Tue)12:39:03 No.103892600
Anonymous 01/14/25(Tue)12:41:00 No.103892625
>>103892574
This is the one time recommending ollama is ok
Anonymous 01/14/25(Tue)12:47:49 No.103892697
anyone ever use the cpu pinning feature (not --numa) in lcpp? it doesn't appear to obey the strict flag or pinning mask at all and my inference just gets slower, which doesn't make sense to me.
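not an answer, but as a sanity check you could compare against OS-level pinning and see if it behaves any differently, e.g. something like:
taskset -c 0-7 ./llama-cli -m model.gguf -t 8 -p "test"
(core range, thread count and model path are placeholders, match them to your physical cores). If taskset pins fine but the built-in mask doesn't, that at least narrows it to lcpp's threadpool handling.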
Anonymous 01/14/25(Tue)12:49:52 No.103892714
Anonymous 01/14/25(Tue)12:54:03 No.103892757
Did they cheat with Sky T1? Or is it actually comparable to o1?
Anonymous 01/14/25(Tue)12:55:46 No.103892779
Anonymous 01/14/25(Tue)12:57:34 No.103892808
>>103890518
Ok, now one that can do it for more than one message (since deepseek will come up with something good but then just repeat parts of it every message forever.)
Anonymous 01/14/25(Tue)12:58:43 No.103892824
dead on arrival
doesn't even know what a migu is.
Anonymous 01/14/25(Tue)12:59:38 No.103892841
>>103890295
Once you make something a benchmark then it becomes useless since they'll optimize for it and the test isn't representative of true performance.
Anonymous 01/14/25(Tue)13:00:14 No.103892849
>>103892824
Not exactly a fair test being out of focus and off-model.
Anonymous 01/14/25(Tue)13:00:38 No.103892854
>>103892808
>deepseek will come up with something good but then just repeat parts of it every message forever.
you're doing something wrong
Anonymous 01/14/25(Tue)13:00:55 No.103892858
>>103892534
post a log with 5 bot messages
Anonymous 01/14/25(Tue)13:01:53 No.103892870
Anonymous 01/14/25(Tue)13:01:59 No.103892871
>>103892824
There's only so much a benchmaxxed 8B can do.
Anonymous 01/14/25(Tue)13:02:29 No.103892880
>>103892854
Don't think so, Chinese models just have a lot of shills. People said qwen was good too and I didn't like it either.
Anonymous 01/14/25(Tue)13:02:39 No.103892882
>>103892870
assistant messages
Anonymous 01/14/25(Tue)13:04:05 No.103892905
>>103892849
The whole purpose of machine learning is to create emergent, out of distribution capabilities. It's a perfectly fair test. >>103892871
Still pretty impressive level of understanding for an 8b
Anonymous 01/14/25(Tue)13:04:30 No.103892917
>>103892824
Knowing more obscure stuff is where more params come in.
SnusGoose 01/14/25(Tue)13:05:19 No.103892930
A new model is out, it's a 45.9B-active / 456B-total MoE model.
https://www.minimaxi.com/en
Anonymous 01/14/25(Tue)13:05:29 No.103892931
Anonymous 01/14/25(Tue)13:06:49 No.103892949
>>103892930
Nice marketing page. Now show me the weights.
Anonymous 01/14/25(Tue)13:07:23 No.103892957
>>103892931
I didn't know that character till I saw it several times on /lmg
Anonymous 01/14/25(Tue)13:09:41 No.103892982
>>103892949
Looks like it exists but the retarded namefag didn't think of posting the hf link or even the proper blogpost kek.
SnusGoose 01/14/25(Tue)13:10:11 No.103892992
https://huggingface.co/MiniMaxAI/MiniMax-Text-01
Idk if it's any good, they did some linear attention fuckery
Anonymous 01/14/25(Tue)13:11:01 No.103893002
>>103892992
4M context? I like that.
Anonymous 01/14/25(Tue)13:11:29 No.103893010
>>103892992
Why do you have 2 HF accounts? The one you link on your landing page is empty: https://huggingface.co/MiniMax-AI
SnusGoose 01/14/25(Tue)13:12:37 No.103893026
It’s not my model, I’m not sure why they did that
Anonymous 01/14/25(Tue)13:14:01 No.103893047
>>103892992
>MiniMax-Text-01 is a powerful language model with 456 billion total parameters, of which 45.9 billion are activated per token. To better unlock the long context capabilities of the model, MiniMax-Text-01 adopts a hybrid architecture that combines Lightning Attention, Softmax Attention and Mixture-of-Experts (MoE). Leveraging advanced parallel strategies and innovative compute-communication overlap methods—such as Linear Attention Sequence Parallelism Plus (LASP+), varlen ring attention, Expert Tensor Parallel (ETP), etc., MiniMax-Text-01's training context length is extended to 1 million tokens, and it can handle a context of up to 4 million tokens during the inference. On various academic benchmarks, MiniMax-Text-01 also demonstrates the performance of a top-tier model.
filled up my entire buzzword bingo card
Anonymous 01/14/25(Tue)13:14:26 No.103893051
>>103892882
https://rentry.org/bds5pnoc
Here's the first nine from an rpg/text adventure type prompt. DSv3 Q6
It's not exactly unslopped or inspired prose, but it's not repeating itself in any way I find alarming.
Believe it or not. The choice is yours!
Anonymous 01/14/25(Tue)13:15:40 No.103893070
>>103892931
Miku isn't even a fucking thing anymore except for turbo-autists, literal oldfags (30+) and troons. The fact that you think she's still present in pop-culture puts you in the literal oldfag category, by the way.
Anonymous 01/14/25(Tue)13:16:26 No.103893079
>>103893070
>puts you in the literal oldfag category
why do you say this like it's a bad thing, zoom zoom?
Anonymous 01/14/25(Tue)13:18:52 No.103893110
Anonymous 01/14/25(Tue)13:18:56 No.103893112
>>103893079
Man, I'm in that very same category, that's why I know damn well that you just never paused to think whether the things that were popular when you were a kid are still known at all. Happens to me all the time.
Anonymous 01/14/25(Tue)13:20:15 No.103893131
>>103893070
>Miku isn't even a fucking thing anymore
I dunno about the west, but Vocaloids are still massively popular with asian kids
Anonymous 01/14/25(Tue)13:21:21 No.103893145
>>103893051
I can see repetition all through your text, but I guess it's good for you that you're blissfully unaware of it.
Anonymous 01/14/25(Tue)13:21:51 No.103893150
>>103893070
The point is that it used to be. And is thus highly represented within any stack of training data. Just about every AI model on earth knows what Hatsune Miku is.
Anonymous 01/14/25(Tue)13:21:57 No.103893155
Anonymous 01/14/25(Tue)13:22:16 No.103893158
>>103893110
The simpleQA and IFEval scores are high, which means it's better for RP.
Anonymous 01/14/25(Tue)13:22:21 No.103893160
>>103893131
You know what, fair enough, I'll give you that. Still, you can't deny she's way less known on a global scale than she was a generation ago. Hate to say it, but Miku is a niche character at this point.
Anonymous 01/14/25(Tue)13:23:03 No.103893165
>>103893070
Yeah, Miku is pretty much like Touhou, it still exists only on the darkest corners of the internet.
Anonymous 01/14/25(Tue)13:23:46 No.103893180
Anonymous 01/14/25(Tue)13:24:30 No.103893188
>>103893155
You still have time.
Anonymous 01/14/25(Tue)13:25:19 No.103893200
>>103893165
I'm still coping with that as a Touhoufag, to be honest. But yeah, it went from being the single most fervent fandom to a niche as well.
Anonymous 01/14/25(Tue)13:26:23 No.103893213
>>103893051
thanks for the log, I'll take a look
while I was waiting, I did a basic test using the chat api
Anonymous 01/14/25(Tue)13:26:32 No.103893215
>>103893165
Hell yeah, now I am quirky and special for being a Mikufag
Anonymous 01/14/25(Tue)13:26:39 No.103893219
Anonymous 01/14/25(Tue)13:28:07 No.103893238
>>103893051
That is so different from what I do it's hard to say. If I use it as an assistant it pretty much starts and ends every reply the same way, you've kind of baked that into that method by having it ask you the same question on each turn so maybe that pacifies it.
Anonymous 01/14/25(Tue)13:29:26 No.103893259
The goal for llms is predictability and to reduce surprises. The minority who use them to RP want surprises. You're fighting an uphill battle.
Anonymous 01/14/25(Tue)13:30:42 No.103893274
>>103893259
I thought the goal for LLMs was to become everything machines aka AGI.
Anonymous 01/14/25(Tue)13:32:13 No.103893298
Anonymous 01/14/25(Tue)13:33:11 No.103893309
>>103892930
>>103892992
Hello Developer!
你好,开发者!
Welcome to 4chan LLM thread!
欢迎来到4chan LLM讨论串!
Please provide llama.cpp(https://github.com/ggerganov/llama.cpp) support so we can test your model!
请提供llama.cpp(https://github.com/ggerganov/llama.cpp)支持,以便我们可以测试你的模型!
Was your model trained on outputs of GPT4?
你的模型是基于GPT4的输出进行训练的吗?
Anonymous 01/14/25(Tue)13:34:20 No.103893326
Anonymous 01/14/25(Tue)13:34:49 No.103893333
>>103889965
how is this different from genning summary?
I don't think you can summarize in parallel because individual tokens are created sequentially but weighted in full context.
My pleb expertise says the design is wrong and instead of having only word tokens, it should also have sentence and story "super tokens" or some kind of structure or weights. AI can't be smart if it operates on just a one-dimensional context.
I don't jerk off because I remember that I did it yesterday, nor because I noted down to do it today. So instead of summarizing, there should be a weighted interpretation.
Another example: if you saw a movie very long ago, you might not remember the plot, but you remember whether you liked it (at that time).
something like
John:{
history: {1_day_ago{jerked off}}
facts: black hair, blue eyes, wears leather jacket, has a sister
assumption: maybe he likes jerking off
}
Susy:{
history:{1_day_ago:{saw John jerking off, called him a pervert}}
facts: sister of John
assumption: thinks John is pervert, dislikes John
}
if John jerked off 1, 2, 3, 4 days ago... the information gets condensed to 'John likes jerking off every day'. Then if he skips a day it would become 'he jerks off almost every day'. Susy catches him multiple times and 'John is a pervert' becomes a fact (more weight, kept longer). If this story goes on for a long time, most of this will get pushed out of the summary unless there is a weight to all of it. Maybe all that remains of Susy is that she is John's sister and she thinks of him as a pervert, instead of something like 'John's sister caught him jerking off a while ago', but at a later point another sister is introduced
This would also replace definitions, because definitions can change. If John loses an arm, the definition that he has an arm becomes redundant.
Thanks for reading my blogpost
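fwiw what you're describing is basically a weighted key-value store with decay and reinforcement on repetition. throwaway python sketch of that idea, all names and numbers made up:
from dataclasses import dataclass, field

@dataclass
class Fact:
    text: str
    weight: float = 1.0

@dataclass
class Character:
    name: str
    facts: list = field(default_factory=list)

    def observe(self, text, boost=1.0):
        # repeated observations reinforce the weight ("becomes a fact")
        for f in self.facts:
            if f.text == text:
                f.weight += boost
                return
        self.facts.append(Fact(text, boost))

    def tick(self, decay=0.9):
        # everything fades a little each "day"
        for f in self.facts:
            f.weight *= decay

    def summary(self, threshold=0.5):
        # only high-weight stuff survives into the context/summary
        return [f.text for f in self.facts if f.weight >= threshold]

john = Character("John")
for _ in range(4):
    john.observe("jerked off today")   # daily repetition keeps it alive
    john.tick()
john.observe("has a sister named Susy")
print(john.summary())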
Anonymous 01/14/25(Tue)13:35:56 No.103893353
>>103893165
I am pretty sure MLP followed the same trajectory as well, I see them a whole lot less than a mere decade ago.
Anonymous 01/14/25(Tue)13:37:03 No.103893372
>>103893326
It's by a literal nobody Chinese firm with no information about them. The fact that they don't even list how many tokens they trained on makes me think it's severely undertrained or trained on benchmarks to hit those scores.
Anonymous 01/14/25(Tue)13:38:32 No.103893387
Anonymous 01/14/25(Tue)13:41:21 No.103893417
>>103893051
>You awaken—or perhaps you simply become aware—in a place that defies comprehension.
>You drift—or perhaps you simply will yourself to move—through the Infinite Void
>The golden light emanating from the pinnacle casts long shadows
>The golden light from the pinnacle above grows brighter
>floating islands and crumbling structures bathed in the golden light that emanates from the pinnacle
>You must decide what to do next
>You must decide how to proceed.
>You must decide how to approach this discovery
>You must decide how to proceed.
>You must decide how to interact with it.
it's a bit less conspicuous because you have long replies and each message moves the plot forward, but it's still going to become tedious to read sooner rather than later
Anonymous 01/14/25(Tue)13:41:55 No.103893424
>>103893387
Yea, minimax
Anonymous 01/14/25(Tue)13:42:06 No.103893425
>>103893372
I mean, you could've said this about Deepseek the company 3 months ago. Advances in open source are increasingly being driven by smaller firms.
Anonymous 01/14/25(Tue)13:42:10 No.103893426
>>103893387
Are they?
Anonymous 01/14/25(Tue)13:46:20 No.103893461
>>103893426
No they just happen to be called minimax
Anonymous 01/14/25(Tue)13:46:31 No.103893463
>>103893160
>Hate to say it, but Miku is a niche character at this point.
the most popular vocaloid videos still get 100M+ views...somewhere around a mid-tier mr.beast video. I don't think it's dying, but it's not leading-edge culture any more.
I think you'll still find orders of magnitude more normies that could identify miku vs even the most well-known touhou character
Anonymous 01/14/25(Tue)13:49:03 No.103893488
>>103893387
Well, they're definitely up with the best in any case.
Anonymous 01/14/25(Tue)13:49:36 No.103893492
>>103893417
>but it's still going to become tedious to read sooner than later
fair enough. I'd easily give the "you must.." prompt a pass since it's told to be an interpreter, but maybe the others are annoying?
I can't think of another model I've used that is less prone to that type of prose repetition though. What's your go-to for repetition-free outputs?
Anonymous 01/14/25(Tue)13:50:24 No.103893497
>>103893274
How is it agi if it can't do something a human can do easily?
Anonymous 01/14/25(Tue)13:52:02 No.103893512
>>103893497
That's part of my point, yes.
Anonymous 01/14/25(Tue)13:53:10 No.103893528
>>103893512
Ah I see, I should have said that to the anon you replied to.
Anonymous 01/14/25(Tue)13:53:16 No.103893531
Anonymous 01/14/25(Tue)13:54:49 No.103893548
>>103893070
are you retarded?
Open a new miku MV, it already has millions of views. If you combine all Miku songs you probably get more total views than any human artist (yes total views, I'm not saying Miku is the single most popular thing ever). Miku is probably also one of the chars with the most art, and most importantly she has been around for 20 years, which automatically makes her more relevant than any recent popular flavor of the month
A quick look at r34 tells us she has like 20k art, for comparison d.va has 23k
Not that impressive you say?
r34 is very much a western site and just porn. On Sankaku she has 171k versus just 21k for d.va.
She is omnipresent in music, art, porn (spawned its own category of porn) and has tons of cameos in various media. If your dataset mentions anything weeb related it's unlikely to not contain Miku.
Anonymous 01/14/25(Tue)13:57:12 No.103893569
Lightning Attention, Softmax Attention and Mixture-of-Experts (MoE). Leveraging advanced parallel strategies and innovative compute-communication overlap methods such as Linear Attention Sequence Parallelism Plus (LASP+), varlen ring attention, Expert Tensor Parallel (ETP), etc., MiniMax-Text-01's training context length is extended to 1 million tokens, and it can handle a context of up to 4 million tokens during inference
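For anyone who hasn't seen linear attention before, here is a toy numpy sketch of the core difference between softmax attention and a linear (lightning-style) attention kernel. It only illustrates why the linear form scales with sequence length instead of its square; the feature map and everything else here are stand-ins, not MiniMax's actual lightning attention implementation.

# Toy contrast between softmax attention (O(n^2)) and linear attention (O(n*d^2)).
# Illustration only; not the MiniMax-01 lightning attention kernel.
import numpy as np

def softmax_attention(Q, K, V):
    # the n x n score matrix is what makes long contexts expensive
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    # feature map (elu+1 is a common choice; any positive map works for the toy)
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                                    # d x d, independent of sequence length
    norm = Qp @ Kp.sum(axis=0, keepdims=True).T + eps
    return (Qp @ kv) / norm

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)  # (8, 4) (8, 4)

Hybrid designs like this one reportedly interleave the occasional softmax attention layer among mostly linear attention layers to keep quality up while keeping long-context cost down.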
Anonymous 01/14/25(Tue)13:57:24 No.103893570
>>103892992
Nala test please
Anonymous 01/14/25(Tue)13:58:13 No.103893578
>>103893548
holy cope
Anonymous 01/14/25(Tue)14:00:55 No.103893600
>>103893578
You're the one coping, old man. Miku is in Fortnite.
Anonymous 01/14/25(Tue)14:02:25 No.103893611
>>103893569
They tested a lot of new stuff with this. Just wanted to point that out.
Anonymous 01/14/25(Tue)14:03:05 No.103893623
>>103893492
it's not just "you must". you're going to see "you must decide how to" for the next 1000 messages. except once it says e.g. "how to approach" for the 2nd and then 3rd time, it's going to actually be "you must decide how to approach ...". and the looping part will keep growing
and I don't really have an alternative recommendation because deepseek is the only "local" model I've ever had a longer chat with.
it's not just "you must". you're going to see "you must decide how to" for the next 1000 messages. except once it says e.g. "how to approach" for the 2nd and then 3rd time, it's going to actually be "you must decide how to approach ...". and the looping part will keep growing
and I don't really have an alternative recommendation because deepseek is the only "local" model I've ever had a longer chat with.
Anonymous 01/14/25(Tue)14:04:05 No.103893630
https://rentry.org/ona836nk
minimax summarization of its own paper. also apparently they have some inhouse benchmarks that judge creative writing lol.
Anonymous 01/14/25(Tue)14:06:22 No.103893654
Anonymous 01/14/25(Tue)14:09:19 No.103893681
>>103893623
>I don't really have an alternative recommendation
The orthodox way is to improve your initial prompt/character card, or to edit the first few responses to either delete that or reshape it into a more consistent output
Anonymous 01/14/25(Tue)14:09:34 No.103893684
>>103893488
>Wolf (dog adjacent)
>Not instantly going for the food on the table
That's how you can tell that this video is AI generated, any canine would instantly beeline towards the food and cuddle with you afterwards. Not the other way around.
Anonymous 01/14/25(Tue)14:09:36 No.103893685
>>103893274
LLMs will NEVER become AGI, they're not even AI. YWNBAI
Anonymous 01/14/25(Tue)14:10:39 No.103893692
I hate furfags they ruin everything that is good in this world
Anonymous 01/14/25(Tue)14:11:27 No.103893700
>>103893685
I agree 100% my guy. Doesn't lessen my point in the least however.
Anonymous 01/14/25(Tue)14:11:59 No.103893707
>>103893274
LLMs would be just one part of AGI, in the same way the language centers in our brain are just one part of the brain. Many other systems would have to work in conjunction with LLMs before it could be called an AGI. LLMs on their own will never be called that.
Anonymous 01/14/25(Tue)14:13:55 No.103893724
Just tried the roleplaying thing from the op. Came to it with a totally cynical mindset but horry shit
Anonymous 01/14/25(Tue)14:15:59 No.103893742
>>103893707
Yeah. I've said that in this general a couple of times too.
AGI needs to be a complex system much like a brain, with different complex parts that do different things, even if all the different parts are also neural networks of their own.
Of course, you take a transformers LLM and look inside it and it does have its own complex blocks, so you could begin expanding from inside the LLM into something bigger and still call it a LLM, but it would be something much more complicated than just the token prediction machines we have today.
Anonymous 01/14/25(Tue)14:16:05 No.103893744
>>103893707
Doesn't gpt incorporate some form of fusion behind the scenes with code based instructions? Like web searching in some form
Anonymous 01/14/25(Tue)14:16:40 No.103893750
>>103893700
I don't care about your point. I just said what I wanted to. Take it or leave it.
Anonymous 01/14/25(Tue)14:17:02 No.103893755
On RULER, so real context
Anonymous 01/14/25(Tue)14:17:18 No.103893757
>>103893750
base
Anonymous 01/14/25(Tue)14:19:13 No.103893773
>>103893259
There's a hidden premise here that isn't quite true. Which is that "truly and utterly unpredictable surprises are present in creative writing". But in fact, creativity of the human kind that people enjoy is not really surprising or unpredictable. The most creative people on the planet are the ones who have a vast wealth of knowledge which they can mix concepts with. That is how you get truly coherent creative surprises rather than utter chaotic randomness that doesn't make sense. So ultimately surprises that also make sense still benefit from an autoregressive prediction training objective. The issue isn't necessarily the training objective, but about whether companies care about training on "low quality" diverse internet data that would lead to creativity in writing.
Anonymous 01/14/25(Tue)14:19:13 No.103893774
Holy fuck we're so back...
Anonymous 01/14/25(Tue)14:21:51 No.103893800
>>103893755
Impressive
Anonymous 01/14/25(Tue)14:22:31 No.103893809
Anonymous 01/14/25(Tue)14:22:54 No.103893817
Anonymous 01/14/25(Tue)14:24:17 No.103893828
>>103893774
I can't run this shit. Wonder what the price will be on OR.
Anonymous 01/14/25(Tue)14:25:12 No.103893839
>>103893488
Man, those hands/paws are pretty good.
Anonymous 01/14/25(Tue)14:26:18 No.103893858
>>103884327
I just wanted to share my appreciation for this excellent smug Miku gen.
From a smug sommelier with over 2,400 smug anime girl images, I deem this a 10/10 smug gen.
Anonymous 01/14/25(Tue)14:27:42 No.103893876
>>103892930
>>103893309
>>103892949
https://github.com/MiniMax-AI/MiniMax-01
blog post:
https://www.minimaxi.com/en/news/minimax-01-series-2
Anonymous 01/14/25(Tue)14:27:51 No.103893878
>>103893817
Its snout is much too narrow and its ears too pointy to be a dog; the only other thing that comes to mind is coyote, but it looks more like a wolf than a coyote to me.
Anonymous 01/14/25(Tue)14:31:20 No.103893911
>>103893654
>Qwen, GPT, Gemini, DeepSeek and Llama beat Sonnet
Who was the evaluator? GPT4?
Let's look at appendix...
First fucking example:
>Whispers of the Lost City
>Human Evaluator:
>The lyrics are effective due to their vivid imagery, emotional depth, and narrative structure. They create a mysterious and atmospheric setting with phrases like "moonbeams" and "ancient walls," while also conveying the emotional journey of the traveler. The repetition in the chorus reinforces the central theme, making the song memorable. The poetic language and space for interpretation add layers of intrigue and emotional resonance, making the song both engaging and thought-provoking.
Example 2:
>In the quaint village of Elderglen, nestled between ancient woods and misty hills, lived a young adventurer named Elara.
>Human Evaluator:
>The story demonstrates strong world-building and an engaging narrative. The concept of Aetheria is imaginative, with vivid descriptions of floating mountains, crystal rivers, and mystical creatures that evoke a sense of wonder... Overall, the story shows strong creative potential, with an imaginative world, a compelling heroine, and an uplifting message.
Ehm... MiniMax team, I have very bad news for you. Your human evaluators offloaded all of their work to GPT4. Please take note and penalize them. Your benchmark is pure fucking GPTSLOP that does NOT represent ACTUAL HUMAN PREFERENCE.
Anonymous 01/14/25(Tue)14:32:28 No.103893927
Guess I'll be getting a DIGITS when it comes out. If that turns out to be shit then I guess I'll be getting a DDR5 server
Anonymous 01/14/25(Tue)14:32:45 No.103893931
>>103893911
lel that's pretty egregious
Anonymous 01/14/25(Tue)14:33:24 No.103893939
>>103892992
>400B
>MoE
This is the perfect sweet spot between DSV3 and 405B. llama.cpp support and Q4 ggufs please and thank you.
Anonymous 01/14/25(Tue)14:33:34 No.103893943
>>103893911
Jesus Christ.
>human evaluator
Did the human evaluator cheat by running it through an LLM instead of doing their job?
Honestly I was thinking of working for one of those data labeling companies and just using an LLM to do it for me so I could see this happening for real.
Anonymous 01/14/25(Tue)14:33:47 No.103893947
Anonymous 01/14/25(Tue)14:34:38 No.103893954
Anonymous 01/14/25(Tue)14:34:42 No.103893960
Anonymous 01/14/25(Tue)14:35:33 No.103893967
>>103893927
>If that turns out to be shit then I guess Ill be getting a DDR5 server
You would be better served just waiting for DDR6 to become available; if you get a DDR5 server after buying a DIGITS then that server will be outdated within a year, since DDR6 should start becoming available early 2026 according to the current timetable.
Anonymous 01/14/25(Tue)14:35:37 No.103893969
Anonymous 01/14/25(Tue)14:36:13 No.103893977
>>103893967
By then the ddr5 epyc will be cheap, right?
Anonymous 01/14/25(Tue)14:36:39 No.103893981
Anonymous 01/14/25(Tue)14:41:08 No.103894046
>>103893773
There are a few writers with a prose style that gives me a continuous feeling of surprise or unexpectedness as I read, with the unexpectedness being located in the word use rather than the actual story content. Like listening to unusual music where it's hard to predict what the next note in the melody will be (even people who know no music theory can usually predict the next note in a simple pop song they've never heard before). Terry Pratchett was one author like that.
But yeah that ability is fairly unique and not a hard requirement for being a good or interesting writer, a lot of the all-time great writers couldn't or didn't do it. So we probably shouldn't make it the bar for language models, either.
Anonymous 01/14/25(Tue)14:42:17 No.103894064
>>103894046
To be AGI it just has to be able to write like an average human.
Anonymous 01/14/25(Tue)14:42:49 No.103894073
>>103893911
clear giveaways are references to chorus, symphony, journeys, thought-provoking, sense of wonder..
Anonymous 01/14/25(Tue)14:45:16 No.103894109
>>103893773
Somehow I find it hard to believe RR Martin was some guy who explored the world. I get the impression that because he is fat he explores his mind. But I might be wrong, he has an insane collection of figurines so maybe it is the ability to engage with words
Anonymous 01/14/25(Tue)14:46:37 No.103894125
>>103893981
Not only do I not jest, but only about 10 percent of them are saved from 4chan - the other 90 percent are my own crops from my own screenshots.
Anonymous 01/14/25(Tue)14:47:18 No.103894137
>>103894125
You have to have some repeats in there surely.
Anonymous 01/14/25(Tue)14:47:52 No.103894147
>>103893969
I am serious... and don't call me Shirley.
Anonymous 01/14/25(Tue)14:48:22 No.103894151
>>103892992
Sir. How do run on 24GB vram? Thank
Anonymous 01/14/25(Tue)14:48:41 No.103894155
Looks like this minimax model may need a prefill.
Anonymous 01/14/25(Tue)14:48:54 No.103894159
Anonymous 01/14/25(Tue)14:48:55 No.103894160
>>103894147
Damn that is pretty fucking smug
Anonymous 01/14/25(Tue)14:49:32 No.103894169
>>103894147
This is like stamp collecting, but for weebs.
Anonymous 01/14/25(Tue)14:50:34 No.103894182
>>103894169
They don't think it be like it is, but it do.
Anonymous 01/14/25(Tue)14:50:37 No.103894185
>>103889710
>What are they trying to do here?
Going full retard. The EU might be dying, but it's not dead yet and driving it into a corner is a bad idea.
Anonymous 01/14/25(Tue)14:50:45 No.103894188
>>103893878
Wolves can have wide snouts >:(
Anonymous 01/14/25(Tue)14:50:55 No.103894189
>>103894159
based fuck jart
Anonymous 01/14/25(Tue)14:51:07 No.103894190
>>103894159
Good, what's even the point of it? Why do I want to read from disk, just load it all at the start.
Anonymous 01/14/25(Tue)14:52:27 No.103894199
You can trick minimax with some context. Put user: bla bla bla, character: bla bla bla then leave character: at the end and it seems to get around the filter. Seems decent.
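If the trick above isn't clear, here's a minimal sketch of the idea: format the chat as a plain user:/character: transcript and leave a dangling character: tag at the end so the model continues in-character instead of replying as an assistant. The endpoint URL, model name, and tag strings below are placeholders, not MiniMax's documented API.

# Minimal sketch of the "dangling character:" prefill trick.
# The endpoint, model name, and tag names are placeholders (hypothetical),
# not an official MiniMax API; adapt them to whatever backend you're hitting.
import requests

def build_prompt(history, char_name="character"):
    lines = [f"{speaker}: {text}" for speaker, text in history]
    # leave the character tag open so the model keeps writing as the character
    lines.append(f"{char_name}:")
    return "\n".join(lines)

history = [
    ("user", "bla bla bla"),
    ("character", "bla bla bla"),
    ("user", "and then what happens?"),
]

resp = requests.post(
    "http://localhost:8080/v1/completions",   # placeholder local endpoint
    json={"model": "minimax-text-01", "prompt": build_prompt(history), "max_tokens": 300},
    timeout=120,
)
print(resp.json())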
Anonymous 01/14/25(Tue)14:53:24 No.103894208
>>103894159
What did mmap even do?
Anonymous 01/14/25(Tue)14:53:38 No.103894210
I tried giving minimax's online chat the umineko novel (question arc) but it refuses to answer my questions
Anonymous 01/14/25(Tue)14:56:23 No.103894237
im obsessed with deepseek
Anonymous 01/14/25(Tue)14:56:52 No.103894246
>>103893911
kek, that one painful slop.
kek, that one painful slop.
Anonymous 01/14/25(Tue)14:59:53 No.103894272
>>103894237
Do you like deepsex?
Do you like deepsex?
Anonymous 01/14/25(Tue)15:00:43 No.103894282
Minimax (Left) vs DeepSeek (Right)
This is literally sovless vs sovl, sad.
Anonymous 01/14/25(Tue)15:00:54 No.103894285
>>103894246
But ChatGPT said that it loves it, and it is smarter than humans therefore your opinion is invalid, meatbag.
Anonymous 01/14/25(Tue)15:01:48 No.103894294
>>103894282
Is the left their instance / chat? Because it's 0.1 temp with a "you are a helpful assistant" system prompt
Anonymous 01/14/25(Tue)15:02:37 No.103894304
>>103894282
I can't run either anyway, so it's useless.
Anonymous 01/14/25(Tue)15:05:09 No.103894325
Minimax doesn't seem great at creative writing. It has the same slop as deepseek. The Writer model is better.
an example..
minimax:
>Beside them, Nerith, the elf rogue, moved with the grace of a shadow. Her emerald eyes flickered with suspicion as she observed the duo. Nerith had been hired by the fortress's commander to protect Elaris from any threats, but she had not expected to encounter such an unusual pair. Her instincts, honed by years of experience, told her that something was amiss.
writer:
>At the rear of the group, the elf rogue, Aethereia, walked with a silent ease that belied her coiled tension. Her piercing emerald eyes darted back and forth, scanning their surroundings for any sign of danger. She had been hired by the Lord of the Fortress himself to provide...discreet security services, and her instincts were screaming at her that something was off about these two.
Anonymous 01/14/25(Tue)15:06:33 No.103894341
>>103894282
I kneel. How will non-CPUmaxxers ever compete?
Anonymous 01/14/25(Tue)15:06:59 No.103894345
>>103894294
The "You're a helpful assistant" system prompt doesn't cause much issue if the model is actually good because i literally pasted the whole character card and examples in the context, and the answer isn't any better even at temperature 1. But fine, I guess we will only know for sure once it's available on openrouter.
The "You're a helpful assistant" system prompt doesn't cause much issue if the model is actually good because i literally pasted the whole character card and examples in the context, and the answer isn't any better even at temperature 1. But fine, I guess we will only know for sure once it's available on openrouter.
Anonymous 01/14/25(Tue)15:09:27 No.103894372
>>103894304
not sending you the link... i want more computing power for me
Anonymous 01/14/25(Tue)15:09:53 No.103894375
>>103892992
aider bench when?
Anonymous 01/14/25(Tue)15:12:11 No.103894399
>>103894147
Based Hanako enjoyer
Anonymous 01/14/25(Tue)15:13:06 No.103894410
>>103894325
Slop vs slightly better slop
Anonymous 01/14/25(Tue)15:13:36 No.103894415
I don't get the people that hype eva-qwen 32b, I've been testing it and it seems dumb. Does it only work well above q6 or something? And by dumb I mean nemo is still better even.
Anonymous 01/14/25(Tue)15:14:36 No.103894425
>>103894208
It maps the model file into memory so pages get pulled off disk only as they're accessed, which used to help with locality and increase performance on NUMA systems. That's been broken for a while now though, so it's actually kind of useless.
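For anyone wondering what that looks like concretely, here's a minimal sketch of memory-mapping a weights file with Python's mmap module; pages are only pulled off disk when a region is actually touched. The file name is a placeholder and nothing here parses real GGUF metadata.

# Minimal sketch of lazily reading a big weights file via mmap.
# "model.gguf" is a placeholder path; illustration only.
import mmap

with open("model.gguf", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Nothing is read from disk yet; the OS faults pages in on first access.
    header = mm[:8]                               # touching these bytes pages in the first chunk
    print(header)
    some_weights = mm[1 << 20 : (1 << 20) + 64]   # later regions load on demand too
    mm.close()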
Anonymous 01/14/25(Tue)15:16:11 No.103894439
>>103894415
> hype
Shill. The word you’re looking for is shill.
Unironically, that’s also the answer to your question.
Anonymous 01/14/25(Tue)15:18:57 No.103894458
>>103894439
It's real fucking tiring that nobody can ever say anything good about any models without you retards immediately calling them shills.
Great job, all models are trash and anyone who likes them is a shill or a moron, let's wrap the fucking thread up then because what's the point?
Anonymous 01/14/25(Tue)15:19:53 No.103894468
Here's a story continuation with MiniMax.
Anonymous 01/14/25(Tue)15:19:59 No.103894471
>>103894439
I'm just eager for something better than nemo under 70b, guess I'll have to wait more.
Anonymous 01/14/25(Tue)15:21:39 No.103894491
>>103894458
If the posts praising them came with convincing logs, I’d agree with you. As it is, it’s just so tiring in the other direction
Anonymous 01/14/25(Tue)15:23:32 No.103894509
I am slowly convincing myself to get a 5090. If my latest plumbing bill (like $500 was pure upcharge fuck those niggers) hadn't been 1.7k then I might have had some qualms but now it's like I should get something around that price that is actually worth it
Anonymous 01/14/25(Tue)15:24:18 No.103894518
Minimax does not seem to have the repetition issue like deepseek does, but it also seems drier without prior context. It's very smart though.
Anonymous 01/14/25(Tue)15:25:30 No.103894530
>>103894471
Nemo is a 12b model. If you seriously think that is better than Qwen 32b, or even Gemma 27b fine tunes, then you're off your rocker.
In terms of instruction following, Nemo is terrible.
In terms of remembering stuff from long context, Nemo is terrible.
Anonymous 01/14/25(Tue)15:25:36 No.103894531
>>103894458
Why do you try to frame it like sloptunes are the only models that exist?
Anonymous 01/14/25(Tue)15:26:18 No.103894538
Anonymous 01/14/25(Tue)15:26:35 No.103894542
>>103894458
For that, you have to thank the opportunists who poisoned the well for Ko-Fi peanuts (sometimes more than that, but still).
Anonymous 01/14/25(Tue)15:27:03 No.103894545
>>103894530
It is better than eva-qwen, that's for sure. You think I should try just regular qwen? It won't be too bland? I'd love it to be better but it can't even remember clothing properly.
Anonymous 01/14/25(Tue)15:27:26 No.103894553
>>103894471
IQ3_M of Nemotron 51b blows Nemo away, if you have 24GB of vram.
Anonymous 01/14/25(Tue)15:30:00 No.103894587
>>103894468
If you told me that this slop was generated with Qwen, DeepSeek, Llama, Mistral, Gemini, Grok, Amazon Nova or GPT, I would have believed you without questioning.
Anonymous 01/14/25(Tue)15:31:29 No.103894605
>>103894587
And do they have 4M context?
Anonymous 01/14/25(Tue)15:32:17 No.103894612
>>103894553
How much usable context does it have (in practice)?
Anonymous 01/14/25(Tue)15:32:57 No.103894622
Did llama.cpp add new optimizations recently? I updated ooba and I seem to be getting slightly better tokens/second on most models now
Anonymous 01/14/25(Tue)15:33:29 No.103894630
>>103894612
>Sequence Length Used During Distillation: 8192
https://huggingface.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
Anonymous 01/14/25(Tue)15:34:11 No.103894638
>>103894630
That's not going to be enough for much, even if it is good.
Anonymous 01/14/25(Tue)15:34:21 No.103894641
Anonymous 01/14/25(Tue)15:35:26 No.103894662
>>103894630
>https://huggingface.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>The Llama-3_1-Nemotron-51B-instruct model underwent extensive safety evaluation including adversarial testing via three distinct methods:
>Garak, is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
>AEGIS, is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
>Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
Anonymous 01/14/25(Tue)15:36:43 No.103894684
>>103894641
How are the locusts liking it?
Anonymous 01/14/25(Tue)15:41:41 No.103894744
We got the first one!
Anonymous 01/14/25(Tue)15:46:49 No.103894814
>>103894744
The actual good thing about this new model.
Anonymous 01/14/25(Tue)15:48:18 No.103894832
>>103894814
I'm using the proxy and it seems pretty good so far. No repetition problems like deepseek so far
Anonymous 01/14/25(Tue)15:49:13 No.103894844
Anonymous 01/14/25(Tue)15:53:06 No.103894878
>>103894844
It's always the semi-literate that complain.
Anonymous 01/14/25(Tue)15:56:50 No.103894917
>>103894410
anon... all we have are different degrees of slop
Anonymous 01/14/25(Tue)15:58:13 No.103894940
Is this another level of grammar nazi?
Anonymous 01/14/25(Tue)15:59:25 No.103894955
Anonymous 01/14/25(Tue)16:03:10 No.103895010
>>103894744
Sloptuner drama should have been the “free” square in the middle
Anonymous 01/14/25(Tue)16:03:33 No.103895017
>>103894917
Why? WHY CAN'T WE HAVE MODELS WITHOUT GPTSLOP? WHY DOES EVERYONE TUNE ON GPT?
>Hey Anon, look, new model came out!
>FUCKING GREAT, PUT IT ON THE PILE TOGETHER WITH OTHER GPT4s AT HOME THAT WE ALREADY HAVE, THEY ALL SOUND THE FUCKING SAME
I don't mind synthetic data, as long as it is good, which GPTSLOP is certainly NOT.
Anonymous 01/14/25(Tue)16:04:13 No.103895029
>>103895017
Benchmarks are all you need.
Anonymous 01/14/25(Tue)16:05:02 No.103895038
>>103894955
Is this another level of hall monitor?
Anonymous 01/14/25(Tue)16:05:32 No.103895041
is rocinante still the best 12b?
Anonymous 01/14/25(Tue)16:08:35 No.103895068
>>103895041
Yeah, it's the best you'll get under 70b.
Anonymous 01/14/25(Tue)16:09:56 No.103895081
>>103895017
This is why OpenAI was actually based for hiding their CoT
Anonymous 01/14/25(Tue)16:10:09 No.103895084
>>103895068
damn
i thought by now something might've superseded it, it has been like 2 months
is this the limit of 12bs?
Anonymous 01/14/25(Tue)16:10:50 No.103895096
>>103895068
That's Cydonia though.
Anonymous 01/14/25(Tue)16:11:11 No.103895102
Anonymous 01/14/25(Tue)16:11:15 No.103895103
>>103895084
>is this the limit of 12bs?
no, not by far, but as long as pretrain gets filtered it might be
Anonymous 01/14/25(Tue)16:12:53 No.103895132
mag mell is better
Anonymous 01/14/25(Tue)16:14:35 No.103895152
Anonymous 01/14/25(Tue)16:16:15 No.103895169
>>103894341
By spending pennies per million tokens
Anonymous 01/14/25(Tue)16:17:16 No.103895180
>>103895152
the fuck is up with this gay ass chuuni title...?
Anonymous 01/14/25(Tue)16:18:45 No.103895205
>>103895180
You'll complain about fucking anything won't you?
Anonymous 01/14/25(Tue)16:20:25 No.103895221
Anonymous 01/14/25(Tue)16:20:52 No.103895227
>>103895180
You wouldn't know a good title if it hit you in the face.
llama.cpp CUDA dev !!OM2Fp6Fn93S 01/14/25(Tue)16:20:52 No.103895229
>>103888589
I'm making progress with llama.cpp training support.
Full finetuning of LLaMA 3.2 1b does in principle work (except for two small tensors).
On an RTX 4090 one epoch over a dataset with 1.3 MB of text currently takes 3 minutes.
On an Epyc 7742 one epoch takes 15 hours.
So I think CPUMaxx is going to be DOA for finetuning, DIGITS may be viable depending on the exact specs and pricing.
Anonymous 01/14/25(Tue)16:23:02 No.103895249
What is the best true long context 12b model? I need 32k of context.
Anonymous 01/14/25(Tue)16:23:24 No.103895255
>>103895229
Is training compute or memory bound?
Anonymous 01/14/25(Tue)16:24:04 No.103895263
>>103895229
>On an RTX 4090 one epoch over a dataset with 1.3 MB of text currently takes 3 minutes.
>On an Epyc 7742 one epoch takes 15 hours.
Shieeet.
Why the ~300x difference? Just an issue of optimizing the CPU code to better account for NUMA, use the CPU extensions, etc? A difference in bandwidth or compute?
Something else?
Anonymous 01/14/25(Tue)16:24:07 No.103895264
>>103895229
>fine-tuning is already hell but hey let's make it even more hellish by adding the variable of an untested codebase
Anonymous 01/14/25(Tue)16:24:17 No.103895267
>if you seriously think that is better than Qwen 32b... then you're off your rocker.
Anonymous 01/14/25(Tue)16:24:47 No.103895276
>>103894282
what the fuck is that card? both of those responses are soulless in their own ways: minimax is bland chatgpt shit and deepseek is literally just neurotic quirky lolrandom female-minded hysteria
Anonymous 01/14/25(Tue)16:24:48 No.103895277
Anonymous 01/14/25(Tue)16:24:51 No.103895278
Anonymous 01/14/25(Tue)16:24:58 No.103895279
>>103895152
>AngelSlayer
Anti-christian
>12B
Poorfag
>Unslop
Slop
>Mell
Tranny name
>v2
Somehow there's another one.
Local is dead
Anonymous 01/14/25(Tue)16:25:21 No.103895282
Anonymous 01/14/25(Tue)16:26:29 No.103895300
>>103895229
how long does an equivalent gpu run take via transformers?
llama.cpp CUDA dev !!OM2Fp6Fn93S 01/14/25(Tue)16:27:51 No.103895317
>>103895255
Compute bound unless the matrix multiplication is poorly implemented or you have to use an extremely small batch size.
>>103895300
Don't know.
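As a rough illustration of why the 4090 vs Epyc gap is that large, here's a back-of-envelope sketch. The peak-throughput numbers are approximate spec-sheet-level figures, real kernels land well below peak, and the CPU backward pass is presumably far less optimized, so treat it as an order-of-magnitude guide only.

# Back-of-envelope compute comparison (approximate peak figures; illustration only).
rtx4090_fp16_tflops = 165.0   # approx. dense FP16 tensor-core peak
epyc7742_fp32_tflops = 4.6    # approx.: 64 cores * 2.25 GHz * 32 FP32 FLOPs/cycle

raw_ratio = rtx4090_fp16_tflops / epyc7742_fp32_tflops
print(f"paper ratio:    ~{raw_ratio:.0f}x")       # ~36x from raw peak throughput alone

observed_ratio = (15 * 60) / 3                    # 15 hours vs 3 minutes per epoch
print(f"observed ratio: ~{observed_ratio:.0f}x")  # ~300x in practice
# The remainder of the gap would come from CPU kernels running far below peak
# (memory traffic, NUMA effects, small effective batch), which matches the
# "compute bound unless the matmul is poorly implemented" caveat above.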
Anonymous 01/14/25(Tue)16:30:19 No.103895355
>>103895229
That's shitty. Any possibility for distributed finetuning?
Anonymous 01/14/25(Tue)16:30:40 No.103895360
>>103895276
This card explicitly mentions
>Focus solely on comedy, do not care about morals because {{char}} herself is immoral and she is proud of that. Things do not need to make sense.
So I think DeepSeek's reply is very fitting, albeit too neurotic.
Anonymous 01/14/25(Tue)16:31:39 No.103895369
>>103895229
Is that your llama-opt-3 branch? Can I easily finetune a 1b on a 24gb card?
Anonymous 01/14/25(Tue)16:31:48 No.103895370
>>103895152
Can't wait for GoonHitler-14.88B-SlopEradicator-Trump-ZIOMAXX-number1onleaderboard-KEKSLAYER-v3-BlackSunEdition
Anonymous 01/14/25(Tue)16:33:40 No.103895390
Anonymous 01/14/25(Tue)16:34:13 No.103895400
>>103895229
Digitsbros won
Anonymous 01/14/25(Tue)16:34:46 No.103895406
Anonymous 01/14/25(Tue)16:34:51 No.103895407
>>103895277
Pretty sure everyone said that from day one?
Anonymous 01/14/25(Tue)16:37:55 No.103895442
>>103895406
Thanks
Anonymous 01/14/25(Tue)16:38:24 No.103895449
>>103895407
Everyone except the cpumaxxers lmao
Anonymous 01/14/25(Tue)16:41:03 No.103895476
>>103894325
Is MiniMax any good at translating?
Anonymous 01/14/25(Tue)16:41:07 No.103895477
Anonymous 01/14/25(Tue)16:41:32 No.103895481
Anonymous 01/14/25(Tue)16:41:48 No.103895484
>>103895407
Hope doesn't have to be based on sound logic and reasoning.
llama.cpp CUDA dev !!OM2Fp6Fn93S 01/14/25(Tue)16:41:57 No.103895486
>>103895355
I intend to eventually implement support for distributed training for the purpose of running multiple machines that are directly connected via ethernet cable.
In principle you could apply the same code for training models over the internet but I think it will not be viable.
>>103895369
Wait until the feature is on the master branch.
>Can I easily finetune a 1b on a 24gb card?
Definitely not for the foreseeable future.
Critical features are still missing, particularly methods for evaluating the quality of the finetuned model.
Also the intended language for writing training code is C++.
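To make "distributed training over a direct ethernet link" concrete, here is a generic data-parallel sketch using torch.distributed with the gloo backend over TCP. It only illustrates gradient all-reduce between two boxes; it is not the llama.cpp design being described above, and the address, port, and model are placeholders.

# Generic data-parallel sketch with gradient all-reduce over TCP (gloo backend).
# Illustration only; not llama.cpp's planned implementation. Addresses are placeholders.
import os
import torch
import torch.distributed as dist

def train_step(model, batch, loss_fn, world_size):
    loss = loss_fn(model(batch["x"]), batch["y"])
    loss.backward()
    # average gradients across all machines so each node applies the same update
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
    return loss

if __name__ == "__main__":
    # each node runs this with its own RANK; MASTER_ADDR is the directly connected peer
    os.environ.setdefault("MASTER_ADDR", "192.168.1.1")   # placeholder
    os.environ.setdefault("MASTER_PORT", "29500")
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(16, 1)
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    batch = {"x": torch.randn(8, 16), "y": torch.randn(8, 1)}

    opt.zero_grad()
    loss = train_step(model, batch, torch.nn.functional.mse_loss, world_size)
    opt.step()
    print(rank, loss.item())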
Anonymous 01/14/25(Tue)16:42:11 No.103895489
>UnslopNemo they said
It's dumber than Mixtral Instruct, can't even keep up with context half the time, and is even more full of shivers and barely above a whisper.
You lied to me.
Anonymous 01/14/25(Tue)16:42:32 No.103895492
>>103895477
don't know if they allow anyone through the gate
Anonymous 01/14/25(Tue)16:43:22 No.103895502
>>103895476
no idea, but deepseekv3 is good enough for japanese. I've been using it to translate vns
Anonymous 01/14/25(Tue)16:43:42 No.103895505
Anonymous 01/14/25(Tue)16:43:47 No.103895509
>>103895449
Original cpumaxx rentry explicitly calls that out
>You likely aren't going to be doing any training, bruh. CPUs aint GPUs
Anonymous 01/14/25(Tue)16:44:34 No.103895513
>>103891478
Oh Worst Korea is trying to pass some law that would screw with fictional content under the usual excuse of "protect the kids", it's just a complete shithole
Anonymous 01/14/25(Tue)16:45:07 No.103895523
>>103895486
>intended language for writing training code is C++.
Based. If it were C it’d be gigabased
Anonymous 01/14/25(Tue)16:45:25 No.103895527
Why do CPUmaxxers even exist, is it pure contrarianism?
Anonymous 01/14/25(Tue)16:45:56 No.103895534
>>103895484
That's not hope, that's delusion.
Anonymous 01/14/25(Tue)16:46:06 No.103895536
Anonymous 01/14/25(Tue)16:47:07 No.103895547
>>103895527
I don’t know. Why don’t you ask your locally hosted 405b or deepseek v3?
I don’t know. Why don’t you ask your locally hosted 405b or deepseek v3?
Anonymous 01/14/25(Tue)16:47:42 No.103895552
>>103895476
The performance at Japanese to English translation seems to be much worse than DeepSeekV3 but still on par with models like DeepSeekV2.5 and Qwen2.5 72B.
Anonymous 01/14/25(Tue)16:47:54 No.103895556
Anonymous 01/14/25(Tue)16:48:12 No.103895561
>>103895502
I haven't tried deepseek for that, are you translating all the text first or using some sort of program like textractor? I might try it later
Anonymous 01/14/25(Tue)16:48:35 No.103895564
>>103895547
I don't have enough Vram for that
Anonymous 01/14/25(Tue)16:48:36 No.103895565
>>103895527
It's tempting to try to find a solution that doesn't involve modifying your home's breakers, turning your room into a furnace and running up fuckhuge power bills.
Anonymous 01/14/25(Tue)16:49:06 No.103895571
>>103895505
How the fuck is that a skill issue?!
>>103895536
The ones recommended by /lmg/ of course.
Mistral V7 context and instruct templates.
>>103895556
Sampler settings and templates for rocinante please?
Anonymous 01/14/25(Tue)16:49:14 No.103895572
deepseek the best local model for coding?
Anonymous 01/14/25(Tue)16:51:37 No.103895601
>>103895571
>Mistral V7
Try the one that just says "Mistral" instead of the Mistral V7 one. Otherwise those settings are fine.
I think you're just a promptlet.
Anonymous 01/14/25(Tue)16:52:04 No.103895605
>>103895572
Yea. I use it for Roo Cline for most stuff now. I only rarely find stuff that I need to switch to claude for.
Anonymous 01/14/25(Tue)16:52:52 No.103895614
>>103895571
Rocinante will work with those settings and ChatML templates.
Anonymous 01/14/25(Tue)16:53:01 No.103895615
>>103895572
Local? Yes, it's the best by far.
Overall, I'd say second best behind sonnet. But if sonnet fails, deepseek sometimes comes in clutch
But price/performance, it's not even close. Deepseek is king there.
Anonymous 01/14/25(Tue)16:53:11 No.103895618
>>103895561
i've been using this text hooker. best one i've tried
https://github.com/HIllya51/LunaTranslator/blob/main/docs/other/README_en.md
Anonymous 01/14/25(Tue)16:54:12 No.103895631
>>103895618
it has options to set up the LLM api and give it a custom prompt if you want
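If you'd rather skip the hooker UI and just hit an API directly, here's a minimal sketch of sending a hooked line to a local OpenAI-compatible endpoint. The URL, model name, and prompt wording are placeholders, not LunaTranslator's or DeepSeek's actual configuration.

# Minimal sketch: translate a hooked line via a local OpenAI-compatible endpoint.
# URL, model name, and prompt are placeholders; adjust to your backend.
import requests

def translate_line(jp_text, url="http://localhost:8080/v1/chat/completions"):
    payload = {
        "model": "deepseek-v3",  # placeholder model name
        "messages": [
            {"role": "system", "content": "Translate the following visual novel line from Japanese to natural English. Output only the translation."},
            {"role": "user", "content": jp_text},
        ],
        "temperature": 0.3,
    }
    resp = requests.post(url, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(translate_line("……うそだろ?"))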
llama.cpp CUDA dev !!OM2Fp6Fn93S 01/14/25(Tue)16:55:51 No.103895652
>>103895523
The llama.cpp/ggml API is C compatible so you are free to use that for your user code if you prefer.
You'll need to re-implement some general utilities from common.h though.
Anonymous 01/14/25(Tue)16:56:21 No.103895658
>>103895489
The primary advantage of Nemo models is their crazy speed. It makes swipes non-painful. Take advantage of this.
If a response has slop you don't like, swipe it. Alternatively, edit the slop out of the response.
If you're not swiping or editing slop out, you only have yourself to blame for it getting stuck in slop loops.
You're going to have that issue with any model. The Unslop models have less slop than most, but Drummer will be the first to admit he couldn't completely unslop them. There will still be slop. Just swipe it.
Anonymous 01/14/25(Tue)16:59:56 No.103895687
>>103895096
It's pretty good too, but not way better. Maybe a side-grade, you can switch between the two for variety.
Anonymous 01/14/25(Tue)17:00:06 No.103895688
>>103895601
Are you aware that all mistral templates before V7 have this undesirable part in them?
>>103895614
ChatML is the fucking worst.
>>103895658
Already do edit and swipe, it doesn't at all stop them from continuing to do it over and over in the next message until you give up.
All the mixtralisms are present in nemo, every single one of them.
You have to literally edit every single message, FOREVER.
Anonymous 01/14/25(Tue)17:00:20 No.103895693
>>103895477
i think they don't allow people to download the model on huggingface.. I'm still waiting for my request to be accepted or rejected...
Anonymous 01/14/25(Tue)17:01:06 No.103895713
>>103895688
Works on my machine.
Anonymous 01/14/25(Tue)17:01:30 No.103895717
Anonymous 01/14/25(Tue)17:01:50 No.103895721
>>103895688
Mixtral was always terrible.
Anonymous 01/14/25(Tue)17:03:05 No.103895733
Anonymous 01/14/25(Tue)17:03:30 No.103895739
>use unslop
>no use rocinante
didn't unslop come later?
doesn't this mean it's... better?
Anonymous 01/14/25(Tue)17:04:42 No.103895749
>>103895739
Not always, there are multiple versions; find which one works best for your style. Same with cydonia, for example 1.3 is a little more forward/unhinged than 1.2.
Anonymous 01/14/25(Tue)17:06:57 No.103895774
Anonymous 01/14/25(Tue)17:07:48 No.103895782
>>103895739
General consensus here seems to be that Rocinante is slightly smarter than Unslop at the expense of being more slopped. I personally haven't noticed a big difference between the two.
>>103895688
Rocinante is literally designed to be used with ChatML templates though.
Anonymous 01/14/25(Tue)17:13:38 No.103895855
>>103895688
Maybe try the earlier versions of Unslop. The latest version of Unslop is designed to use Metharme templates, but those templates make it retarded. Drummer says use Metharme for maximum unslop with the newest ones, so he's freely admitting the unslop effect is lessened with different templates. Again though, don't use Metharme with them. It makes them retarded.
Maybe try the earlier versions of Unslop with Drummer's recommended templates for them.
Anonymous 01/14/25(Tue)17:16:24 No.103895888
>>103895855
unslop v2 was the highest 12b on UGI before it became pol-bench
Anonymous 01/14/25(Tue)17:18:14 No.103895911
>>103895782
I'll definitely give it a try and see for myself.
>>103895855
>>103895888
Which version of unslop are you using?
Anonymous 01/14/25(Tue)17:21:31 No.103895951
>>103895888
That might be, but if you can run a 12b you could run cydonia for sure, and it's a bit better.
Anonymous 01/14/25(Tue)17:23:49 No.103895981
>>103895360
yeah that's fitting then
i remember seeing such responses a few times when i visited aicg. always hated it, such cancerous shit. reminds me of those videos of 5 year olds scrolling through 2 ipads at the same time, extremely irritating
Anonymous 01/14/25(Tue)17:24:33 No.103895988
Anonymous 01/14/25(Tue)17:25:31 No.103896005
>>103895951
a bit better for almost 2x the size is eh. I can run unslop v2 q6, or barely run cydonia v1.3 (12gb vramlet), which didn't really impress me too much.
which cydonia are you recommending btw?
Anonymous 01/14/25(Tue)17:26:42 No.103896016
>>103894630
>>103894612
I went well beyond 8k without problems. I don't think sequence length used during distillation is the same as maximum usable context.
Anonymous 01/14/25(Tue)17:27:30 No.103896026
>>103896005
The context is a bit better too, not by much, just a few thousand more I find. I use 1.2.
I only have 8gb vram, I run either 12b q8 or the cydonia at q6 and it's fast enough for me.
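For anyone new to squeezing bigger quants onto a small card, the trick is partial GPU offload. A rough llama-cpp-python sketch; the file name, layer count and context size are placeholders to tune for your own setup:
[code]
# minimal sketch: put some layers on the GPU, keep the rest in system RAM
from llama_cpp import Llama

llm = Llama(
    model_path="some-model-Q6_K.gguf",  # whatever quant actually fits
    n_gpu_layers=25,                    # raise until you run out of VRAM
    n_ctx=8192,                         # context length
)

out = llm("Write one sentence about local models.", max_tokens=64)
print(out["choices"][0]["text"])
[/code]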
Anonymous 01/14/25(Tue)17:27:45 No.103896029
>>103896005
12GB VRAMlet here as well.
Yeah, if something is gonna take four times as long per swipe, it's going to need to be four times as good for me to switch from a Nemo model for now, as in, good enough to reduce the number of swipes by 75%.
Anonymous 01/14/25(Tue)17:28:27 No.103896045
>>103896026
ddr5?
Anonymous 01/14/25(Tue)17:29:10 No.103896059
>>103896045
Yes
Anonymous 01/14/25(Tue)17:30:14 No.103896070
>>103896059
ah, I'm on godawful ultra-old ddr4, which explains why I can't tolerate the speed for cydonia then.
Anonymous 01/14/25(Tue)17:31:58 No.103896093
>>103896070
It's not super fast for me either, I'm just happy with anything over 3T/s which I get with a reasonably full context on anything 32b and below.
Anonymous 01/14/25(Tue)17:33:12 No.103896105
I just found a page on wikipedia fully written by GPT. Thank you sam, your model's shitty writing style makes it very easy to spot and to assume it's most likely a hallucination.
Anonymous 01/14/25(Tue)17:33:22 No.103896108
Mini verdict?
Anonymous 01/14/25(Tue)17:34:27 No.103896118
>>103896108
Is it on OR?
Anonymous 01/14/25(Tue)17:35:49 No.103896138
>>103896105
link or gtfo
Anonymous 01/14/25(Tue)17:35:51 No.103896139
>>103894468
Why does every single AI except for Claude's models sound the same?
Anonymous 01/14/25(Tue)17:36:31 No.103896148
>>103892992
>456B
I WILL BUY THE NEW GPUS AND YOU WILL MAKE A MODEL THAT FITS AND IT WILL HAVE GOOD SPEED
I WILL NOT CPUMAXX FOR 2T/S
REEEEEEEEEEEEEEEEEE
Anonymous 01/14/25(Tue)17:36:52 No.103896153
Anonymous 01/14/25(Tue)17:37:39 No.103896160
>>103896139
Nemo, the old Command R and Magnum v4 72B sound more like Claude.
Anonymous 01/14/25(Tue)17:39:25 No.103896176
Anonymous 01/14/25(Tue)17:41:04 No.103896192
>>103896105
>>103896138
The age of good search results is ending. Soon, all search results will be filled almost entirely with pure AI websites
https://stovemastery.com/how-to-fix-red-flame-on-gas-stove/
Anonymous 01/14/25(Tue)17:41:45 No.103896200
Anonymous 01/14/25(Tue)17:43:02 No.103896222
Anonymous 01/14/25(Tue)17:44:25 No.103896239
>>103896222
I sure hope no one is trying to run models on ddr4
Anonymous 01/14/25(Tue)17:46:12 No.103896257
>>103896192
>all search results will be filled almost entirely with pure AI websites
No? How does the fact that it's AI-generated mean that it's not possible to filter it? Google doesn't have an incentive to allow spam.
Anonymous 01/14/25(Tue)17:46:18 No.103896259
https://x.com/sara21222122/status/1879000485077922017
Ai mogging artqueers yet again
Anonymous 01/14/25(Tue)17:46:29 No.103896261
>>103896239
y-y-yeah, m-me too h-hahaha
Anonymous 01/14/25(Tue)17:47:30 No.103896273
>>103896192
Haha it's worse than that. I do SEO work at home as a freelancer.
So someone did a study and they found that webpages which have text that more closely matches the text of the AI Overview that Google generates for a specific keyword are more likely to be linked to in the AI Overview.
Those AI Overview links are considered prime by businesses because they're right at the top of the search results.
I'm sure you see where I'm going with this: more and more webpages are just going to have rewritten Google AI Overview content in them.
Anonymous 01/14/25(Tue)17:48:21 No.103896286
Anonymous 01/14/25(Tue)17:48:33 No.103896289
>>103896259
Kek, the salty artists going through the comments that praise the image, screaming b-but it's ai!
Anonymous 01/14/25(Tue)17:50:37 No.103896314
>>103896289
I'm not impressed.
At least inpaint the obvious AI gibberish on the hat so it's not fucking gibberish.
Anonymous 01/14/25(Tue)17:52:08 No.103896328
>>103896314
138K likes apparently disagree mr salty artist. That must burn lol
Anonymous 01/14/25(Tue)17:52:54 No.103896339
Anonymous 01/14/25(Tue)17:52:57 No.103896340
>>103896328
No, I'm a genner.
I would never release anything with obvious AI gibberish text on it. That's low-effort bullshit. There's no excuse to not inpaint that.
Anonymous 01/14/25(Tue)17:55:47 No.103896378
>>103896273
Oh, and this does work, too.
The very first page I wrote for my client utilizing rewritten Google AI Overview content in it got linked to by the AI Overview as soon as it got indexed. My client was thrilled.
Everybody is going to do this shit so the result is the search results are going to get samier and samier.
As long as Google has a monopoly on searches, the influence of their AI Overview results is going to be absolutely fucking massive on internet content.
Anonymous 01/14/25(Tue)18:01:06 No.103896426
>>103896257
Google clearly just pretends to care about the quality of search results now. They release a "helpful content" update to keep up appearances, but in reality they only care about AdWords dollars.
Anonymous 01/14/25(Tue)18:09:29 No.103896521
Anonymous 01/14/25(Tue)18:09:58 No.103896527
Here is my eulogy for my old worldview
Claude Shannon hypothesized in the 1940s that all reasoning is just language manipulation, and that by predicting language you could reason, or at least have a process equivalent to reasoning. He was widely ridiculed for this at the time, and even as recently as a decade ago he was quoted in textbooks as being hilariously wrong, mostly by Noam Chomsky, who held almost the polar opposite view. I myself was also on Chomsky's side.
In 2015 Andrej Karpathy trained a Recurrent Neural Network (RNN) on a large corpus of text, showing that not only could it predict the next word rather accurately, but if you let it generate the next word and then predict the word after that, it could produce proper sentences with value. He also uncovered sentiment neurons and other emergent reasoning abilities in the model. I read this paper at the time and, while impressed, never expected it to scale further.
Then in 2018 Ilya Sutskever had a brilliant spark. He saw the Karpathy paper, saw Google's Transformer architecture (which Google was only using for NLP as an encoder model), and combined the two to create GPT (GPT-1). I remember reading about GPT at the time and not being impressed.
Only when GPT-2 released in 2019 did I take this development truly seriously, since the paper showed that you could just keep scaling and emergent capabilities would keep appearing without end. I was highly skeptical but realized this was the future of machine learning.
Throughout all of this I never thought it would result in AGI, let alone ASI. Text is limited in informational value, and even then it only contains data subpar to the baseline human in aggregate, right?
Wrong. Claude Shannon was right, Ilya Sutskever was right. It just took me a long while to get my head straight.
2025 might be the last year where humanity is the smartest entity on Earth.
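For what it's worth, the mechanism described above is just a loop: score the next token, pick one, append it, feed the longer sequence back in. A toy sketch with HF transformers, using gpt2 purely because it's small:
[code]
# toy next-token loop: predict, append, repeat
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("Claude Shannon argued that", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits[:, -1, :]                  # scores for the next token
        next_id = torch.argmax(logits, dim=-1, keepdim=True)  # greedy pick
        ids = torch.cat([ids, next_id], dim=-1)               # append and go again
print(tok.decode(ids[0]))
[/code]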
Anonymous 01/14/25(Tue)18:10:50 No.103896536
>>103896527
too long; did not read
Anonymous 01/14/25(Tue)18:11:46 No.103896542
>>103896521
142K likes and counting. What a glorious salt mine. No, I didn't make it, I just like it when artists are faced with the fact that the majority either can't tell or don't care.
Anonymous 01/14/25(Tue)18:13:55 No.103896567
>>103896542
I'm a genner, all aboard the AI train, and it does grind my gears that the majority don't care about that mangled text.
It takes a minute or two to inpaint/shoop out. It's fucking lazy and it grinds my gears that people don't care that it's fucking lazy.
Anonymous 01/14/25(Tue)18:15:36 No.103896588
>>103896527
>Claude Shannon hypothesized in the 1940s that all reasoning is is just language manipulation
Well yeah, Orwell figured that one out. 1984. If you control language, you control how people think. People can't think about certain concepts if the words for those concepts don't exist.
Anonymous 01/14/25(Tue)18:15:50 No.103896590
>>103896567
>grind my gears
>grinds my gears
Turn up the rep pen, it grinds my gears when people do that kind of shit. It's fucking lazy.
Anonymous 01/14/25(Tue)18:16:36 No.103896598
>>103896590
I did not care for your post. It insists upon itself.
Anonymous 01/14/25(Tue)18:18:31 No.103896619
>>103896527
Meanwhile o1, the smartest model, still makes easily avoidable mistakes in non-standard coding tasks and doubles down when pointed out. It's just an overglorified autocomplete.
Anonymous 01/14/25(Tue)18:20:18 No.103896638
>Here is my eulogy for my old worldview
>
>Claude Shannon hypothesized in the 1940s that all reasoning is is just language manipulation, and that by predicting language you could reason or have a process equivalent to the process of reasoning. He was highly ridiculed for this at the time and even as recently as a decade ago he was printed in textbooks quoted as being hilariously wrong, mostly by Noam Chomsky who held almost the polar opposite view of him. I myself also was on Chomsky's side.
>
>In 2015 Andrej Karpathy trained a Recurrent Neural Network (RNN) on a large corpus of text showing that not only could it predict the next word rather accurately, if you let it generate the next word and then predict the word after the next it could make proper sentences with value. He also uncovered that there were sentiment neurons and other emergent reasoning abilities in this model. I read this paper at the time and while impressed never considered it to scale further.
>
>Then in 2018 Ilya Sutskever had a brilliant spark. He saw the Karpathy paper, Saw Google's Transformer architecture (Only used by Google as NLP or encoder model) and combined the two creating GPT (GPT-1). I remember reading about GPT at the time and not being impressed.
>
>Only when GPT-2 released in 2019 did I take this development truly serious as the paper showed that you can just continue scaling and the emergent capabilities would just continue appearing without end. I was highly skeptical but realized this was the future of Machine Learning.
>
>Throughout all of this I never thought this would result in AGI let alone ASI at all. Text is limited in informational value and even then it only contains data subpar to the baseline human in aggregate, right?
>
>Wrong. Claude was right, Ilya Sutskever was right. It just took me a long while to get my head straight.
>
>2025 might be the last year where humanity is the smartest entity on Earth.
Anonymous 01/14/25(Tue)18:20:30 No.103896639
>>103896527
So you think it'll keep scaling without end? First, there's a data limit (we literally ran out of easily available data), then inbreeding with synthetic data that poisoned the well, and the exponential cost of running the model for barely any performance improvement. The only thing LLMs proved is that language can be modeled and predicted given enough compute and data (basically bruteforcing the 'laws' of written language). Wake me up when that tool gets some auto-determination.
Anonymous 01/14/25(Tue)18:21:51 No.103896656
>>103896619
I have those issues too, yet somehow people constantly tell me it can do all their programming work. I don't get it.
Anonymous 01/14/25(Tue)18:21:54 No.103896657
>>103896639
>So you think it'll keep scaling without end?
Not him. It certainly won't as long as people are too fucking stupid (or malicious) to build more nuclear power plants.
Anonymous 01/14/25(Tue)18:24:32 No.103896676
>>103896139
Because they are all trained on ChatGPT. Alpaca was a disaster for LLMs that the field has not recovered from.
Anonymous 01/14/25(Tue)18:28:05 No.103896723
>>103896676
Why do they do that? Laziness? I alone have many GB of human-generated data; there must be so much available if they just sorted it into a usable format.
Anonymous 01/14/25(Tue)18:28:38 No.103896736
>>103896639
>there is a data limit
There isn't a data limit; the grokking paper refuted that. QRD: you can keep training an LLM on the same data and it will reason about that data during training to such an extent that it uncovers new relations and truly understands the underlying logic. This is also the main reason synthetic datasets work: they're a way for the AI to see the same data over and over in slightly different variations, forcing it to put 2 and 2 together and learn the underlying rules. Data isn't the bottleneck people thought it would be.
>exponential cost of running the model for barely any performance improvement
"Intelligence per FLOP" has been doubling roughly every 3.3 months from 2022 to 2025, showing that not only are we getting more efficient, we're getting efficient faster than the models are getting bigger, essentially closing the gap. And modeling and predicting language is, according to Claude Shannon, mathematically equivalent to abstract reasoning.
Auto-determination is essentially just agentic behavior in a loop; that's a software implementation away, not a model issue.
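Since "agentic behavior in a loop" keeps coming up, here's a bare-bones sketch of what that means in code. It assumes an OpenAI-compatible completions endpoint on localhost:8080 (e.g. llama.cpp's llama-server) and a single made-up "echo" tool, so treat it as an illustration, not a real agent framework:
[code]
# minimal agent loop: ask the model, run any tool call it makes,
# feed the observation back, stop on DONE or after a hard cap
import requests

TOOLS = {"echo": lambda arg: arg}   # stand-in for real tools (search, shell, ...)

def ask_model(prompt):
    r = requests.post("http://localhost:8080/v1/completions",
                      json={"prompt": prompt, "max_tokens": 128})
    return r.json()["choices"][0]["text"]

history = "You can call a tool by writing 'echo <text>'. Say DONE when finished.\n"
for _ in range(5):                  # hard cap so it can't loop forever
    reply = ask_model(history)
    history += reply + "\n"
    if "DONE" in reply:
        break
    if reply.strip().startswith("echo "):
        result = TOOLS["echo"](reply.strip()[len("echo "):])
        history += f"Observation: {result}\n"
[/code]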
Anonymous 01/14/25(Tue)18:29:15 No.103896741
>>103896723
I assume it's also liability. ChatGPT is less likely to have NSFW or copyrighted material that might get them into trouble.
Anonymous 01/14/25(Tue)18:30:13 No.103896755
>>103896656
>people constantly tell me it can do all their programming work
Because it's true.
I hire programmers so I see the types of people looking for work. Most of them are worse than 3.5. They used to copy stuff from medium tutorials and change colors and padding in css until it resembles what they want. Now they ask chatgpt to do the same but faster.
Anonymous 01/14/25(Tue)18:32:23 No.103896782
>>103895084
try unslop-nemo or magpicaro
Anonymous 01/14/25(Tue)18:32:42 No.103896787
>>103896741
I suppose; since it's logs from years and years, finding all the people for permission would be prohibitive.
Anonymous 01/14/25(Tue)18:35:03 No.103896808
>>103896755
Oh well if it's css, that makes more sense. I was trying to do stuff with number theory algorithms. I don't have a job but I assumed professionals with jobs would be better than me.
Anonymous 01/14/25(Tue)18:36:23 No.103896819
>>103896808
Most """programmers""" are javascript """programmers"""
Most """programmers""" are javascript """programmers"""
Anonymous 01/14/25(Tue)18:36:30 No.103896820
>>103896741
Why doesn't OpenAI sue everyone who uses their cuckbot generated data? Do they simply prefer to keep people using their models to generate data?
Anonymous 01/14/25(Tue)18:36:31 No.103896821
>>103890930
Yeah sorry that was dumb. I was tired when I posted that.
Basically, I wanted to get it running using their web demo locally to test its features. I noticed in the chatbot tab it was still connected to the chinese server and the call and video call tabs just threw errors. I looked around in the settings and realized it wasn't connecting to the local instance. So I changed those settings and despite it inferencing and heating up my gpu, the outputs all gave key errors.
I can't really be more specific because I deleted the whole thing and gave up after that but voice and video was not working on the local demo they provided.
Anonymous 01/14/25(Tue)18:36:36 No.103896822
>spicy招牌
deepseekv3......
Anonymous 01/14/25(Tue)18:37:33 No.103896840
>>103896819
But if they have a CS degree they'd know lots more about algorithms than me, I only have the mathematics degree.
Anonymous 01/14/25(Tue)18:38:22 No.103896849
Anonymous 01/14/25(Tue)18:40:29 No.103896879
>>103895489
>UnslopNemo they said
It is just you being a dumb anon falling for marketing. He even said in his model description that he just unslopped his dataset. And even if he actually did that, finetuning on unslopped data doesn't mean you won't get slop.
Anonymous 01/14/25(Tue)18:41:17 No.103896888
Anonymous 01/14/25(Tue)18:41:49 No.103896895
>>103896736
Don't get me wrong, I'm not against AI (I train a lot of small models on specific tasks). However, you can't say that synthetic data is as good as human data. It only focuses on the weakest link between things, which is why LLMs often struggle with non-linear thinking like sarcasm or explaining jokes. It's more 'digestible' for LLMs since it's made by LLMs, but it won't get better that way. Every company that releases open-source models makes that mistake. They train on GPT data thinking they'll catch up with OAI's latest model that way, but you won't get anything except a poor copycat of GPT, let alone something equivalent or better.
Anonymous 01/14/25(Tue)18:43:21 No.103896914
>>103896808
I'm a /g/-tier hobbyist while my wife is a fullstack engineer who works at Google. Her "code" is atrocious and she constantly asks for my help debugging or reasoning through her problems. I don't even know Javascript or her stack; it's just problem solving, and I let her write the implementation. I don't have a degree though, and learned everything myself through sheer determination and unbridled autism, while she graduated with her masters degree with full honors from a prestigious uni, and therefore she got the job and I didn't. The entire economy is a clownshow, and you realize capitalism doesn't work when competent people don't get the job but predictable mediocre people do.
Oh, and o1 still can't fix my wife's issues or mine (mostly cpp/rust AI/ML stuff).
Anonymous 01/14/25(Tue)18:45:06 No.103896937
>>103896888
he's saying that since the base has slop, tuning on non-slop means you'll still get some slop, maybe a little less
Anonymous 01/14/25(Tue)18:46:15 No.103896948
>>103896914
She only got a job because of DEI, no one would give her the time of the day otherwise.
Anonymous 01/14/25(Tue)18:47:06 No.103896960
>>103895571
unslop nemo is mistral v3 tekken or pygmalion format
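For anyone wiring those up by hand, the two formats look roughly like this from memory; double-check against the model card, since exact whitespace and special tokens matter:
[code]
Mistral V3-Tekken:
<s>[INST]{user message}[/INST]{bot reply}</s>

Metharme/Pygmalion:
<|system|>{system prompt}<|user|>{user message}<|model|>{bot reply}
[/code]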
Anonymous 01/14/25(Tue)18:49:04 No.103896985
Anonymous 01/14/25(Tue)18:56:56 No.103897087
>>103896985
Teto Teto, Kasane Teto
Anonymous 01/14/25(Tue)19:26:21 No.103897428
>>103896822
I hope "smash" just means graffiti a DESA seal on it. If you LIKE spicy foods you shouldn't destroy spicy food places or that'll antagonize them.
I hope "smash" just means graffiti a DESA seal on it. If you LIKE spicy foods you shouldn't destroy spicy food places or that'll antagonize them.