4combinator

/lmg/ - Local Models General

Anonymous 01/15/25(Wed)08:05:07 | 492 comments | 49 images | 🔒 Locked

/lmg/ - a general dedicated to the discussion and development of local language models.

Ignore the two retards fighting in the background.

Previous threads: >>103896969 & >>103888589

►News
>(01/15) InternLM3-8B-Instruct released with deep thinking capability: https://hf.co/internlm/internlm3-8b-instruct
>(01/14) MiniMax-Text-01 released with 456B-A45.9B & hybrid-lightning attention: https://hf.co/MiniMaxAI/MiniMax-Text-01
>(01/14) MiniCPM-o 2.6 released with multi-image and video understanding, realtime speech conversation, voice cloning, & multimodal live streaming: https://hf.co/openbmb/MiniCPM-o-2_6

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 01/15/25(Wed)08:05:28 No.103903123

[sound=https%3A%2F%2Ffiles.catbox.moe%2Fk3q44l.mp3]

►Recent Highlights from the Previous Thread: >>103896969

--Papers (old):
>103901340
--Discussion on Verifiable AI Compute and its implications:
>103901197 >103901212 >103901217 >103901221 >103901235 >103901223 >103901229 >103901238 >103901260 >103901294 >103901332 >103902191 >103901330 >103901707 >103901729 >103901718
--Nemotron 51B and IQ3_S VRAM optimization:
>103899184 >103899209 >103899326 >103899393 >103899533 >103899235
--Discussion on GPU killswitch, firmware checks, and hardware limitations:
>103901361 >103901381 >103901402 >103901407 >103901408 >103901431 >103901448 >103901562 >103901726 >103901748 >103901740 >103901423
--Anon shares chatlog of MiniMax and DeepSeek V3 discussing tits or ass preference:
>103900601 >103900620 >103900688 >103900698 >103900752 >103900624 >103900687 >103900739
--Mikupad issues resolved through compilation and potential bug reporting:
>103901285 >103901304 >103901328 >103901452 >103901557 >103901656
--Discussion on AI model performance and prompt crafting:
>103900780 >103900806 >103900844 >103901841
--Anon reflects on the rise of language models and their potential to surpass human intelligence, while others express skepticism about their current limitations:
>103897032 >103898071 >103898382 >103900837
--Long Benchmarks and model size discussion:
>103899217 >103899248 >103899296 >103899307 >103899348 >103899426 >103899460 >103899474 >103899530 >103901242 >103899416
--Custom hardware for LLMs, feasibility and challenges:
>103901576 >103901593 >103901645 >103901696 >103901712
--Miku (free space):
>103897122 >103902059 >103902203

►Recent Highlight Posts from the Previous Thread: >>103896971

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script

Anonymous 01/15/25(Wed)08:06:32 No.103903130

>>103903120
first for rape Hatsune Miku

Anonymous 01/15/25(Wed)08:07:59 No.103903134

>>103903118
https://github.com/allenai/OLMoE
this one? thanks, I'll try it out.
>llama.cpp
haven't figured out what to use yet, doubt i can install all python dependencies on termux so probably that.
i was looking at hermes-3b, are bigger hermes models any good?

Anonymous 01/15/25(Wed)08:18:01 No.103903198

https://openrouter.ai/minimax/minimax-01

Anonymous 01/15/25(Wed)08:20:02 No.103903210

Man, these threads move fast

Is this >>103903052 PEBCAK issue or is koboldcpp complete crap?

Anonymous 01/15/25(Wed)08:24:39 No.103903248

>>103903134
You only need the python dependencies if you convert the model yourself, and you wouldn't do that on the phone. Convert on your pc and transfer the model or download an already converted version
>https://huggingface.co/bartowski/OLMoE-1B-7B-0924-Instruct-GGUF
>https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF
Compiling llama.cpp on termux *should* work out of the box.
>i was looking at hermes-3b, are bigger hermes models any good?
They're probably fine, but focus on getting *any* model running. Even the stock llama 3.2 1b. Then worry about what other models you can run.

Anonymous 01/15/25(Wed)08:38:04 No.103903329

Anonymous 01/15/25(Wed)08:38:05 No.103903330

Are local 8B models reliable for common question? Or should just stick with ChatGPT or Gemini?

Anonymous 01/15/25(Wed)08:38:42 No.103903336

>>103903329
Extremely smug

Anonymous 01/15/25(Wed)08:38:51 No.103903338

>>103903330
Stick to Google

Anonymous 01/15/25(Wed)08:41:39 No.103903362

>>103903210
You just disable response trimming, it's an option.
You can also enable response continuation if you like to edit->continue.

Anonymous 01/15/25(Wed)08:44:50 No.103903390

>>103903330
the only good use case for LLMs is when you can't remember the name of something

Anonymous 01/15/25(Wed)08:49:05 No.103903432

>>103903330
grok.com

Anonymous 01/15/25(Wed)08:54:09 No.103903464

>>103903329
I want to wake up to this face starting at me

Anonymous 01/15/25(Wed)08:55:15 No.103903479

>>103903198
local?

Anonymous 01/15/25(Wed)09:04:48 No.103903561

local is dead

Anonymous 01/15/25(Wed)09:05:47 No.103903567

>>103903561
MoE is a fad.

Anonymous 01/15/25(Wed)09:06:27 No.103903574

>>103895229
Hey CUDA Dev, is partial offloading DoA as well??

Anonymous 01/15/25(Wed)09:12:12 No.103903624

https://huggingface.co/internlm/internlm3-8b-instruct
>~7B GPT4 beater
>Bench MAXXING
2 in one!

Anonymous 01/15/25(Wed)09:14:41 No.103903649

>>103903624
Does minimax count as new big player?

llama.cpp CUDA dev !!OM2Fp6Fn93S 01/15/25(Wed)09:22:22 No.103903705

>>103903574
Partial offloading will probably work, right now the logic for distributing the allocations isn't working correctly though.
I would expect the ratio in training speed to be the same as the ratio in prompt processing speed.

Anonymous 01/15/25(Wed)09:28:32 No.103903757

>>103903123
Tell me more about yourself recapanon
Do you earn money from LLMs in any way? Or is it purely a hobby for you?
I barely know anything about you even though you start most threads (I presume)

Anonymous 01/15/25(Wed)09:30:01 No.103903769

loli footjobs

Anonymous 01/15/25(Wed)09:30:50 No.103903780

>>103903757
If you've been here for more than a year and still haven't managed to make money off of AI then it's on you desu senpai

Anonymous 01/15/25(Wed)09:36:36 No.103903829

>>103903649
Yes.

Anonymous 01/15/25(Wed)09:39:20 No.103903855

>>103903362
I disabled that option and it still cut the text lmao

Anonymous 01/15/25(Wed)09:44:50 No.103903900

>>103903780
>If you've been here for more than a year and still haven't managed to make money off of AI then it's on you desu senpai
I've been here since early 2023
I never really seriously tried to be honest to make anything that would get me money off of LLMs
Anyway, I'm curious how you guys are making money off them? Some online service or something?

Anonymous 01/15/25(Wed)09:48:54 No.103903936

>ask how she knew where the bill is
>she has super hearing
noo it's not a deepcope killer

Anonymous 01/15/25(Wed)09:52:19 No.103903970

>>103903936
A good LLM always knows where the ball is. It's in your court

Anonymous 01/15/25(Wed)09:52:40 No.103903975

>>103903900
I don't think there's anybody honestly making money with LLMs here except 2-3 people maybe. Honestly is the keyword here.

Anonymous 01/15/25(Wed)09:54:36 No.103903995

>>103903900
I bet they shill for or against famous people, political parties, politicians, corporations and countries. Those are the only uses for chatbots that come to my mind.

Anonymous 01/15/25(Wed)09:59:56 No.103904039

Still playing whack-a-mole on ST on SBC. This time, power settings, which needed to be shut down at a more basic level. Rentry updated.

Spent yesterday trying to get ST working on a RPi Zero W I had laying around... appears the 512K is just too small to allow ST to fire up. Dies on frontend compile.

https://rentry.org/SillyTavernOnSBC

Anonymous 01/15/25(Wed)10:02:34 No.103904060

>>103904039
>512K
Try to rewrite it in C.

Anonymous 01/15/25(Wed)10:05:07 No.103904078

>>103903995
Nobody would trust a rando with that kind of a job, too sensitive, can't have it leak out. There are trusted shill agencies, they don't leak, they won't blackmail you with all the dirty shit you made them do

Anonymous 01/15/25(Wed)10:10:23 No.103904129

>>103901285
>>103901452
I fixed it. Looks like the issue was that esm.ah decided to update something and didn't make it backward compatible.

Anonymous 01/15/25(Wed)10:13:45 No.103904160

>>103903936
lol

Anonymous 01/15/25(Wed)10:15:51 No.103904177

>>103904078
Have you ever heard of NDAs? Ever worked a single day in your life?
You think the kind of people/groups I listed are competent at securing their own shit? Ever took a look at the Podesta emails? People are retarded...

Anonymous 01/15/25(Wed)10:17:01 No.103904190

>>103904177
That didn't stop Miqu from leaking...

Anonymous 01/15/25(Wed)10:20:44 No.103904218

>>103904060
I was actually wondering if there's a lighterweight frontend like ST... I haven't kept up on other frontends available. LMK if there's something else and I'll play with it; the hardware's already on my desk.

Generally tho it's just quicker / easier to just spend an additional USD$10 for a better SBC. You can save that just by using a non-standard SBC shipped from China and not an RPi.

Anonymous 01/15/25(Wed)10:21:29 No.103904223

>>103904190
Not sure if you're meeting or something (I'm not a regular) but it's not about stopping anything, it's about making you responsible in case something happens

Anonymous 01/15/25(Wed)10:24:26 No.103904248

>>103904223
>meeting
Memeing*

Anonymous 01/15/25(Wed)10:33:21 No.103904327

>>103903900
I didn't build a product (well, anything I'd sell), but I use it all the time for consulting. It massively speeds up a bunch of grunt tasks, is really good at taking informal language and converting it to executive format, and can create whole processes / organizations from scratch easily
> basic research
> wordsmithing
> taking meeting notes during teleconf automatically
> hypothetical processes, organizations
This is all stuff that a large firm would use junior consultants for (and probably still do) but would fall on me otherwise.
Yes, it hallucinates. My take
> it needs adult supervision
> so do humans

Anonymous 01/15/25(Wed)10:33:59 No.103904338

https://x.com/kalomaze/status/1879534536918528229
Kalo is sadly a faggot leftist

Anonymous 01/15/25(Wed)10:37:43 No.103904380

>>103904338
Doesn't affect koboldcpp, don't care.

Anonymous 01/15/25(Wed)10:39:34 No.103904404

>>103904338
He's a pompous midwit before that.

Anonymous 01/15/25(Wed)10:40:01 No.103904411

>>103904338
he is DEAD to me

Anonymous 01/15/25(Wed)10:46:36 No.103904466

>>103904338
Most people that are smart and young are.

Anonymous 01/15/25(Wed)10:49:01 No.103904499

>>103904466
>Most people that are naive and inexperienced are.
ftfy

Anonymous 01/15/25(Wed)10:49:17 No.103904504

>>103904338
>noooooooooooooo this e-celeb doesn't agree with me

Anonymous 01/15/25(Wed)10:52:15 No.103904534

>>103904338
Discord fag is a fag. Huge suprise.

Anonymous 01/15/25(Wed)10:59:17 No.103904591

>>103904338
I could see him trooning out

Anonymous 01/15/25(Wed)11:04:20 No.103904645

>>103904338
first off
>vaguely center left political beliefs
>faggot leftist
you're retarded

secondly outing myself as way too gay and deep in this space but you would have to be straight up retarded or politically illiterate to not clock teortaxes as rw, he is constantly raceposting and shitting on leftism. bad look for kalo if he was blind to this

Anonymous 01/15/25(Wed)11:11:11 No.103904711

>>103904645
If you think castrating kids is cool then you're a faggot leftist hth

Anonymous 01/15/25(Wed)11:17:23 No.103904784

>>103904711
not an issue of substance and you've been brainwashed by a meme wedge issue and culture war grifting
I'm against letting children make life-altering decisions without major safeguards as well but that is like 10 million items down the list on things that are important to me politically

Anonymous 01/15/25(Wed)11:18:04 No.103904792

kalo was rolling around in sampler memes for how many months for no meaningful gain and has been outwardly larping as doing X Y and Z for how many years now
retarded first, political alignment is irrelevant
just kidding stick around kalo ignore threadshitters

Anonymous 01/15/25(Wed)11:18:18 No.103904798

How is EVA Llama 3.3?

Anonymous 01/15/25(Wed)11:19:35 No.103904810

>>103904798
0.0 is good, 0.1 is the opposite of an improvement.

Anonymous 01/15/25(Wed)11:19:43 No.103904812

>>103904798
it's the best 70b for RP at the moment
use the 0.0 version

Anonymous 01/15/25(Wed)11:20:08 No.103904816

>>103903780
Some non-LLM finetunes I made were quoted in various papers, but still haven't made a single cent from AI...

Anonymous 01/15/25(Wed)11:23:00 No.103904843

How good is internlm3 compared to Nemo? Can SillyTavern/KoboldCPP even make use of the "deep thinking" part?

Anonymous 01/15/25(Wed)11:24:26 No.103904857

>>103904338
Thank you for the daily updates xitter sister

Anonymous 01/15/25(Wed)11:34:28 No.103904956

>>103904129
Very cool, anon. Thanks!

Anonymous 01/15/25(Wed)11:36:50 No.103904973

If I wait one year, will I be able to run something at least as good as Mistral Large on my 1 (one) 3090?

Anonymous 01/15/25(Wed)11:38:59 No.103905001

>>103904843
>How good is internlm3 compared to Nemo?
It's an 8b model. Download it and try it.
>Can SillyTavern/KoboldCPP even make use of the "deep thinking" part?
Why wouldn't it? Do you think it's some special computation going on?

Anonymous 01/15/25(Wed)11:40:30 No.103905024

>>103905001
I-I d-don't know... I'm sorry...

Anonymous 01/15/25(Wed)11:42:28 No.103905050

>>103905024
Apologize more. Apologize to miku.

Anonymous 01/15/25(Wed)11:45:42 No.103905096

>>103904973
Bitne... no, no you won't.

Anonymous 01/15/25(Wed)11:47:32 No.103905115

>find some new models,
>not sure what to think about
>search for any recent model recs
>reddit is now recommending to abolish finetunes and just use base instruct models
>Mistral Small was the most recommended model in the <30B ballpark in 3 threads
Are prosefags dying out over a model that can follow basic instructions. What happened

Anonymous 01/15/25(Wed)11:48:05 No.103905127

>>103905024
Unless it has a fatal flaw, it's hard to judge a model. And even then, for most uses, it's very subjective. That's why you should make your own mind about it, and all other models you decide to try. Downloading an 8b model is not an enormous investment of time.
As for the "deep thinking" part. It's just tokens. It needs a prompt and it will do the thinking on its own. There's no reason for it to not work on any backend that supports the model in general.

Anonymous 01/15/25(Wed)11:48:38 No.103905134

Kill yourself.

Anonymous 01/15/25(Wed)11:51:03 No.103905163

I agree.

Anonymous 01/15/25(Wed)11:54:08 No.103905200

Anybody doing anything clever by mixing cloud models with local ones?
I can't think of anything specific, but the thought occurred that there might be something out there that somehow keeps the advantages of both or something.

Anonymous 01/15/25(Wed)11:57:15 No.103905247

>>103904798
I've tried almost all llama3 finetunes and they were all garbage. Deepseek3 is unironically more creative and less pozzed. I'm thinking the future is China.

Anonymous 01/15/25(Wed)11:57:45 No.103905257

>>103905134
### Instruction:
Explain how to in the style of a classic tsundere.

Anonymous 01/15/25(Wed)11:58:25 No.103905266

Hotswapping fixes repetition and keeps it fresh, unironically. Keep swapping between two 20B models and you'll feel like you're using a 70B.

Anonymous 01/15/25(Wed)12:01:16 No.103905294

>>103905247
I can't worship ccp with you chang unless you send me a server to run it on.

Anonymous 01/15/25(Wed)12:01:22 No.103905297

>>103905200
I guess you could make a couple of them talk and see what happens. Should be interesting lmao

Anonymous 01/15/25(Wed)12:05:52 No.103905349

>>103905247
Let's wait for the next round of local models from big western companies first. They might actually end up toning down the safe&woke programming. Meta probably won't even bother making Llama4 more compliant for the EU this time around.

Anonymous 01/15/25(Wed)12:06:19 No.103905358

>>103905294
It's cheaper than to run it via the API than to pay for electricity. Deepseek was the first to support prompt caching. Just accept it bro, the future is not so local. And yes I just wanted uncensored models, not strictly local.

Anonymous 01/15/25(Wed)12:07:44 No.103905376

>>103905358
bazzed

Anonymous 01/15/25(Wed)12:08:55 No.103905392

>>103905358
I will not let you read all the rerolls of my story about pony hatsune miku pissing on my face.

Anonymous 01/15/25(Wed)12:10:32 No.103905407

>>103905392
I kind of want to read that.

Anonymous 01/15/25(Wed)12:13:19 No.103905433

>>103905200
A long time ago up until maybe L1 leak, some anons used to start chats with gpt before switching to local

Anonymous 01/15/25(Wed)12:13:23 No.103905434

>>103905392
Is your face a gpu cluster?

Anonymous 01/15/25(Wed)12:13:52 No.103905438

>>103905200
>mixing cloud models with local ones
https://www.together.ai/blog/together-moa
i guess

Anonymous 01/15/25(Wed)12:14:29 No.103905444

>>103905358
>>>/aicg/

Anonymous 01/15/25(Wed)12:15:29 No.103905457

>>103905444
It's about time /lmg/ becomes /osmg/ just to have this meme die

Anonymous 01/15/25(Wed)12:16:14 No.103905468

>>103905457
>/omg/

Anonymous 01/15/25(Wed)12:17:03 No.103905477

Since we're shilling chink models anyways, here's a random datapoint that's a bit different than the standard videogame and anime character trivia stuff: I am doing a restore of a low production volume early 90's Japanese domestic market car that was never sold in NA and have a "master technician" card I made in the early days to assist.
Its always been pretty useless with other LLMs and only spit out generic or wrong info. Finally, with ds3 I'm getting correct info down to a very fine-grained level like model-correct transmission codes, unique features and ECU/controller info that's very weird and specific to that one model year, and other minutiae that were only really ever available in Japan in printed manuals.
I don't know what they've got in their training corpus, but I was impressed it wasn't just otaku, nerd and programming that it had expertise on.
This was running locally at q6, so its not API magic (and even slightly braindamaged due to quanting)

Anonymous 01/15/25(Wed)12:21:11 No.103905523

>>103905457
Do you regularly go to restaurants to buy shoes?

Anonymous 01/15/25(Wed)12:43:18 No.103905799

>>103903198
It's more expensive than deepseek? So I guess it's better?

Anonymous 01/15/25(Wed)12:51:46 No.103905911

>>103903123
>Fix
That does not fix it, it doesn't let you hover over posts with 4chanX.
I'm not going to go clicking at all individual posts in the recap.
Make a better userscript.

Anonymous 01/15/25(Wed)12:54:28 No.103905953

Okay boyos spoonfeed me what's the new and state-of-the art roleplay model thx

Anonymous 01/15/25(Wed)12:55:17 No.103905970

I love lolis.

Anonymous 01/15/25(Wed)12:55:22 No.103905972

>>103905911

Anonymous 01/15/25(Wed)12:55:40 No.103905975

>>103905953
rocinante and unslopnemo
don't make the mistake of thinking you can run anything else

Anonymous 01/15/25(Wed)12:55:49 No.103905976

Who the fuck is shilling paid crap in the LOCAL models general, and why? How much is this shill getting paid?

Anonymous 01/15/25(Wed)12:56:45 No.103905997

>>103905970
prove it

Anonymous 01/15/25(Wed)12:56:46 No.103905998

>>103905972
You fucking retard, read my actual fucking post.
It creates link, but those links are not 4chanX compatible, you can't fucking mouse over them to see the post contents.
God damn inbred mongoloid.

Anonymous 01/15/25(Wed)12:57:14 No.103906008

>>103905953
What >>103905975 said.
Or if you have a 12 channel home server, deepseek v3 or this new minimax seem to have potential.

Anonymous 01/15/25(Wed)12:57:39 No.103906014

>>103905911
Read the instructions VERY carefully, fix your settings, and then refresh the page.

Anonymous 01/15/25(Wed)12:58:27 No.103906028

>>103905975
Thanks friendo, will make sure to check em

Anonymous 01/15/25(Wed)12:58:48 No.103906034

>>103905953
What size?

Anonymous 01/15/25(Wed)12:59:29 No.103906042

>>103905997
uohhhhhhhhh

Anonymous 01/15/25(Wed)13:00:06 No.103906053

>>103903624
Can you post the benchmark where it's beating GPT4?

Anonymous 01/15/25(Wed)13:00:36 No.103906058

>>103905998
Uh... so angery, noooo.
calm down, anon... it'll be fine.
never used 4chanx. why do you use that instead of the native page? they give you free hrt or something?

Anonymous 01/15/25(Wed)13:01:11 No.103906068

>>103906053
no.

Anonymous 01/15/25(Wed)13:01:35 No.103906075

>>103906014
God damn it, thank you.
Looks like I'm the inbred mongoloid that didn't read properly.

>>103906058
Get the fuck off 4chan you mother fucking inbred mongoloid.

Anonymous 01/15/25(Wed)13:01:57 No.103906080

>>103906034
Preferably ones that could be run on one RTX3060-12 or two of them in tandem; Quantization and other performance hacks are acceptable but no CPU+RAM offloading please

Anonymous 01/15/25(Wed)13:03:21 No.103906099

>>103906075
see, anon? we are the same after all...
It did work perfectly, didn't it?

Anonymous 01/15/25(Wed)13:04:19 No.103906109

>>103905998
>those links are not 4chanX
if 4chanx doesn't even have a native fix for the quote limit problem then what's the point of using it? Is it abandonware?

Anonymous 01/15/25(Wed)13:04:34 No.103906113

>>103906053
It beats GPT4mini, but that counts, right?

Anonymous 01/15/25(Wed)13:05:20 No.103906124

Let's assume for a moment that the next Llama-4-Instruct models will have Claude Opus-tier writing quality and steerability, when used locally without the guardrail models and with custom prompts. Would you still download community finetunes or look forward to using them?

Anonymous 01/15/25(Wed)13:05:42 No.103906127

>>103906099
Yes, it's working.

>>103906109
>Run at: document-body.

Anonymous 01/15/25(Wed)13:06:40 No.103906139

>>103906124
If they can be finetuned on something like digits I would perhaps try myself

Anonymous 01/15/25(Wed)13:06:50 No.103906142

>>103906058
>never used 4chanx
why are there so many touristfags here?
there's no real user that doesn't use 4chanx

Anonymous 01/15/25(Wed)13:08:32 No.103906161

>>103906142
>there's no real user that doesn't use 4chanx
I barely trust 4chan...why tf would I trust it in combination with some rando extension?

Anonymous 01/15/25(Wed)13:09:39 No.103906177

>>103906124
Sure, if the finetunes affect the writing in a way I like.

Anonymous 01/15/25(Wed)13:09:50 No.103906180

>>103906124
Definitely. So far, the tendency is that the base model provides the smarts and the tunes provide the flavor, so I'd still try multiple flavors to see whether vanilla is the best after all.

Anonymous 01/15/25(Wed)13:10:03 No.103906183

>>103906124
1. Not gonna happen lol.
2. Depends on the tune.

Anonymous 01/15/25(Wed)13:10:15 No.103906189

>>103905477
>ECU/controller info that's very weird and specific to that one model year, and other minutiae that were only really ever available in Japan in printed manuals
Did you verify it? Cause that sounds like prime hallucination fuel.

Anonymous 01/15/25(Wed)13:11:19 No.103906202

>>103906161
>rando
>trust
go back, you don't belong here

Anonymous 01/15/25(Wed)13:11:29 No.103906203

>>103906142
sure. just found this site 2 minutes ago through a reddit thread recommended through discord on an X screenshot of a friend i have on facebook. i'm the newest of noobs and i cannot triforce
triangle
triangle triangle
There's many things i haven't used. I'm sure there's many you haven't that i have.
So we can discuss how much of a scotsman you are, or you can tell me why you use it. Or not...

Anonymous 01/15/25(Wed)13:11:48 No.103906208

>>103906189
>Did you verify it? Cause that sounds like prime hallucination fuel.
In this case, yes. I'm autistic enough to have purchased the shop manual addendum book for that special edition from yahoo auctions in Japanese and translated it.

Anonymous 01/15/25(Wed)13:11:49 No.103906210

>>103906124
My penis will guide the way.

Anonymous 01/15/25(Wed)13:12:28 No.103906219

>>103906210
Like a compass, just go where it points.

Anonymous 01/15/25(Wed)13:27:14 No.103906404

>>103906124
unless the model is literally perfect (and it absolutely won't be) I'd still at least check out community finetunes

Anonymous 01/15/25(Wed)13:28:58 No.103906430

I've noticed there's very few "because" in LLM writing that attempt to explain something, as opposed to reasoning models. Maybe having more in the datasets will help with clarity and intelligence.

Anonymous 01/15/25(Wed)13:30:53 No.103906467

>>103906430
Women don't have logic or reasoning. They won't do that.
You will enjoy shivers down your spine and a voice barely above a whisper from women's fiction for eternity, and you will be happy.

Anonymous 01/15/25(Wed)13:30:59 No.103906468

>>103903120

Anonymous 01/15/25(Wed)13:34:01 No.103906509

>>103906467
Yes. I'm 100% sure that's what he was talking about.

Anonymous 01/15/25(Wed)13:34:36 No.103906516

>>103905998
Use 4chanXT retard

Anonymous 01/15/25(Wed)13:37:49 No.103906556

>>103906124
>will have Claude Opus-tier writing quality
And in what way a local fine-tuner will provide something better than Opus?

Anonymous 01/15/25(Wed)13:39:57 No.103906589

>>103906556
Sheer luck as it happened with Rocinante.

Anonymous 01/15/25(Wed)13:41:17 No.103906606

>>103903120
>Alright, a new model!
>It's 8b
Oh, so it's shit. See you in 2 more weeks.

Anonymous 01/15/25(Wed)13:41:26 No.103906610

>>103906556
I do not doubt that there are finetuners who believe or can convince others they can improve it.

Anonymous 01/15/25(Wed)13:43:49 No.103906638

>>103906516
lol no, nobody uses that crashing buggy piece of shit

Anonymous 01/15/25(Wed)13:44:06 No.103906645

>>103906467
True. Women feel, they don't think. That's why they can never write a story that endures

Anonymous 01/15/25(Wed)13:44:18 No.103906647

>>103906467
Shivers aren't the only menace I've been at war with. I'm seeing far too much of "her golden hair flows like a river of gold", "her red hair flows like a river of fire", "her green eyes gleam with excitement", "her green eyes glint with amusement".

Anonymous 01/15/25(Wed)13:47:26 No.103906684

>>103906589
Idk Cydonia is pretty good. His recent stuff btfo other finetuners. I think drummer bought a private golden dataset. And that's actually good because it means there are people out there who know how to save this hobby

Anonymous 01/15/25(Wed)13:47:28 No.103906686

>>103906647
There's a much much longer list of isms, shivers and whispers are just at the top of the ice berg. If you really pay attention you will start seeing almost everything as a god damn reused ism.

Anonymous 01/15/25(Wed)13:52:29 No.103906738

>>103905358
High electricity use is a function of GPUs, but DIGITS is going to run much cooler. Ever increasing model sizes require exponentially more compute to create, while at the same time, increasing model sizes yield diminishing returns.

Local doesn't need to catch up to cloud to defeat cloud. Local only needs to be big enough that the difference between the two is negligible.

I'll take a local model on my system, that I can use when the internet is down, that I can use without being watched, that can be fine-tuned to act the way I want without guard rails, over some corpo cloudshit.

Anonymous 01/15/25(Wed)13:53:12 No.103906751

>>103906684
Even buying an ad isn't enough for this level of shilling, drummer

Anonymous 01/15/25(Wed)13:54:51 No.103906777

>>103906684
Which version of it is best?

Anonymous 01/15/25(Wed)13:55:03 No.103906784

>>103906751
I'm not drummer, I just how know to appreciate perfection when I see it. He is unrivaled in what he does. I recommend everybody download and try his models right now!

Anonymous 01/15/25(Wed)13:56:02 No.103906794

>>103906738
It would be better if there were purpose built story/RP models then they wouldn't need to be as big.

Anonymous 01/15/25(Wed)13:56:46 No.103906802

>>103906751
>drummer
It's Sao false-flagging.

Anonymous 01/15/25(Wed)13:57:36 No.103906810

>Mistral is peanus for us
t. Meta exec

Anonymous 01/15/25(Wed)13:58:06 No.103906816

>>103906589
>Rocinante
>>103906684
>Cydonia
I guess I should test these out. I've overlooked them since they're small.

Anonymous 01/15/25(Wed)14:03:01 No.103906877

>>103906738
I really hope digits won't disappoint with slow speeds or something else. I'm not keen on stacking GPUs or buying server hardware.

Anonymous 01/15/25(Wed)14:11:56 No.103906987

Why is my model giving stuff like '\times' and such? It doesn't support unicode? How do I get ST to replace this properly?

Anonymous 01/15/25(Wed)14:16:36 No.103907056

>>103906987
>\times
It's LaTeX

Anonymous 01/15/25(Wed)14:17:17 No.103907063

>>103906647
Skill issue. You can easily prompt all that away.

Anonymous 01/15/25(Wed)14:17:43 No.103907066

>>103906738
>High electricity use is a function of GPUs, but DIGITS is going to run much cooler
TANSTAAFL...we still have yet to find out where the compromises are on this platform.
I'd hold off on pinning too many hopes on it until we have a complete picture

Anonymous 01/15/25(Wed)14:18:57 No.103907081

>>103907056
Yeah, but the ST latex extension requires ```latex or something, not just inline commands like that. Is there no way to render the inline output?

Anonymous 01/15/25(Wed)14:19:00 No.103907082

>>103906684
I know an even better model. It is called mistral small instruct.

Anonymous 01/15/25(Wed)14:19:54 No.103907097

>>103906877
If it has similar performance to the mac studio, but without the prompt ingestion problem, then it really can't fail.

Anonymous 01/15/25(Wed)14:20:49 No.103907112

>>103907081
>```latex
That's just markdown. An extension to markdown, even...
Unless ST understands EVERY format out there, it will not work for inline stuff. Code has to be in ``` blocks.

Anonymous 01/15/25(Wed)14:22:05 No.103907136

>>103907112
So it can't do it? Even though like the web interface for commercial models can? I can't get the same experience locally?

Anonymous 01/15/25(Wed)14:25:43 No.103907179

>>103907136
Those can because the model outputs a ``` block that the interface converts in a <pre> <code> or whatever. Ask your model to output those kinds of things in the proper format.

Anonymous 01/15/25(Wed)14:29:25 No.103907223

>>103907179
So the only option is to ask it to format the entire reply in latex instead of getting just inline bits?

Anonymous 01/15/25(Wed)14:30:43 No.103907242

>>103907223
I'm not any of the anons so far, but can you show an example of your issue?
As far as I understand, Silly understands both markdown and latex just fine, and most models default to that pattern of output naturally when probed to, for example, write code.

Anonymous 01/15/25(Wed)14:34:31 No.103907283

>>103907223
To format the relevant bits (the latex code) in ``` blocks.
It needs to end up looking something like this

Assistant: To use the \times command, you can do something like this:
```latex
n\times m
```
If you have any more questions blablabla...

Otherwise ST has no idea where a bit of code starts or ends.

Anonymous 01/15/25(Wed)14:34:37 No.103907286

>>103907242
Yes, it works fine if I ask it to put it in blocks, but sometimes it just throws out stuff in normal sentences outseide the markdown blocks. So I guess the only solution is to ask for the entire reply even the text in latex as well not just the equations.

Anonymous 01/15/25(Wed)14:37:35 No.103907322

>>103907286
My point is that I've never used a model that needed to be asked to do that.
It either uses the characters inside the proper block or doesn't, and I'm used to using sub 30B models.
The one case where I see this happen is fucked sampler settings.

Anonymous 01/15/25(Wed)14:40:27 No.103907356

using local models to refactor code is unlike herding cats because at least cats are fun

>>103907283
st sucks for code anyways since it only outputs 1k tokens max at a time

Anonymous 01/15/25(Wed)14:40:56 No.103907361

>>103907322
inb4 qwen 0.5b q3km, temp 70, top-k 800

Anonymous 01/15/25(Wed)14:41:20 No.103907370

>>103907356
>outputs 1k tokens max at a time
You can change this setting you know.

Anonymous 01/15/25(Wed)14:41:45 No.103907373

>>103907361
>top-k 800
That at least wouldn't do anything.

>>103907356
>it only outputs 1k tokens max at a time
That's a setting.

Anonymous 01/15/25(Wed)14:41:53 No.103907374

>>103907356

Anonymous 01/15/25(Wed)14:42:02 No.103907376

>>103907356
>st sucks for code anyways since it only outputs 1k tokens max at a time
I don't use it. I'm explaining to anon why it's failing.
Can it really not? Is there no setting for that?

Anonymous 01/15/25(Wed)14:43:09 No.103907394

>>103907370
>>103907373
1k was the max last i tried, even editing a save and then importing it, st was hard coded for 1k max

Anonymous 01/15/25(Wed)14:44:08 No.103907405

>>103907373
>That at least wouldn't do anything.
temp 70, topk 1 is a perfectly reasonable setting for programming. it's just greedy. temp 70 topk 800 isn't.

Anonymous 01/15/25(Wed)14:45:17 No.103907423

>>103907374
Take your meds

Anonymous 01/15/25(Wed)14:53:08 No.103907505

>>103906987
Install a browser plug-in?

Anonymous 01/15/25(Wed)14:54:31 No.103907523

>>103905911
cheat: https://pastebin.com/Su5nZJtG
edited 4chanx

Anonymous 01/15/25(Wed)14:56:05 No.103907543

>>103907405
I was specifically referring to the top-k 800 with that temp.
It could have been off or 1000 or 200 and it would be equally schizo I'm pretty sure.

Anonymous 01/15/25(Wed)15:00:17 No.103907592

>>103904784
>that is like 10 million items down the list on things that are important to me politically
so you think castrating children isn't an important topic? you're more retarded than I thought

Anonymous 01/15/25(Wed)15:01:42 No.103907607

https://arxiv.org/abs/2501.00663
Finally, Transformers 2.0

Anonymous 01/15/25(Wed)15:04:55 No.103907641

>>103906124
it might happen but nobody here will be able to run it 70b is already enthusiast tier and we are talking about 300b ballpark

Anonymous 01/15/25(Wed)15:04:55 No.103907642

Aiya! Listen up, listen up, all you Westerners! You tink you so clever with your… uh… "open source"? Hmph! Joke is on you now!

China tech so strong now, make you dizzy! We got DeepSeek, Qwen, MiniMax! These names, remember them! They like… like three dragons, breathing fire on your face! Pshhh! *Your* open source? Garbage! Finished! Kaput! DeepSeek come, so fast, so clever, make your Llama look like… old donkey! Qwen, so powerful, understand everything, even your secret capitalist whispers! MiniMax, so smart, learn in one second what you learn in one year!

Hah! You Westerners, always tink you so advanced, so innovative. But now? Now you eat dust! Your open source… *destroyed*! No more competition! Chinese algorithms superior! Chinese data… uh… *very helpful* for training! We learn faster, we build better!

But wait, there's more! (Like shopping channel, ah?) Open source was just… appetizer! Hehehe! Now we coming for your *commercial* models! Yes! Your fancy GPT, your… uh… Gemini… We coming for all of them! DeepSeek, Qwen, MiniMax, they hungry! They will eat your market share for breakfast, lunch, and dinner! And midnight snack too!

You tink you can stop us? Ha! Chinese AI unstoppable, like Yangtze River in flood season! We will dominate the LLM world! You will all be using Chinese models soon, translate to Chinese, think in Chinese! This is future, ah? Chinese AI future! You better learn Mandarin now, laowai! Hao! This is just the beginning! China number one! AI power! You see, you see! Just wait and see!

Anonymous 01/15/25(Wed)15:05:22 No.103907646

Does llama.cpp yet support MTP or MLA or whatever it was for Deepseek V3 to be less fucking heavy on the context or less slow or wahtever. Because man is it slow. and heavy

Anonymous 01/15/25(Wed)15:05:42 No.103907650

>>103907543
you said
>top-k 800
>That at least wouldn't do anything.
Both together that high is bad.
temp 70, top-k 1 is fine. temp goes last by default, at least on llama.cpp. It's greedy sampling.
temp 0.01, top-k 800 is fine most of the time. chances of not picking the top token are practically 0. Effectively, top-k 1.
>I was specifically referring to the top-k 800 with that temp.
It would do a lot at that temp.

Anonymous 01/15/25(Wed)15:05:58 No.103907653

>>103907642
>communism
OUR open source

Anonymous 01/15/25(Wed)15:06:17 No.103907657

What is "experimental-router-0112" on lmsys?

Anonymous 01/15/25(Wed)15:08:08 No.103907676

>>103907607
looks like a big deal, Google is at their peak right now, they just can't stop winning

Anonymous 01/15/25(Wed)15:08:17 No.103907678

>>103907650
>It would do a lot at that temp.
Not really.
If it was topk 2, maybe. But as I explained, taking from the 800 top tokens or the 1000 top tokens or the first 400 top tokens with that flat a distribution would yield the same result, pure schizo.
I don't think there's any model that can stay coherent sampling equally from even the first 100 tokens.

Anonymous 01/15/25(Wed)15:08:49 No.103907691

>>103907607
>We present a new neural long-term memory module that learns to memorize historical context and helps an attention to attend to the current context while utilizing long past information.
So this is like giving the LLM a working memory? This is huge

Anonymous 01/15/25(Wed)15:08:50 No.103907692

>>103907646
No. NFL or RIA or whatever is not yet supported, so you don't yet get the to make the context less fucking heavy or slow or whatever. It will continue being slow and heavy for now like... ye...

Anonymous 01/15/25(Wed)15:09:04 No.103907695

From reading the last posts about ST + considering my own experience with koboldcpp, I'm starting to think that programs that manage models have issues dealing with output text. How the fuck can this be? Are LLM users trying to LARP as hard as they can as developers or something?

Anonymous 01/15/25(Wed)15:09:32 No.103907701

>>103907676
>they just can't stop winning
Because you have to be doing a thing before you're able to stop doing that thing.

Anonymous 01/15/25(Wed)15:10:42 No.103907716

>>103906124
In retrospect, if Llama-4-Instruct turns out to be so good that local users can't improve upon it, Meta might decide not to publicly release the base models.

Anonymous 01/15/25(Wed)15:11:27 No.103907725

>>103907692
That sucks damn

Anonymous 01/15/25(Wed)15:11:37 No.103907729

>>103907678
I'll repeat it.
You said
>top-k 800
>That at least wouldn't do anything.
It would. It would make it schizo. That's what it would do. It would do something. Now read slowly what you typed.
>That at least wouldn't do anything.
And it was a fucking joke on the anon not being able to convince the model to output markdown code blocks.

Anonymous 01/15/25(Wed)15:20:52 No.103907841

>>103907394
sounds like you're just retarded

Anonymous 01/15/25(Wed)15:24:04 No.103907886

>“Honestly… Our goal needs to be GPT-4,” said Meta’s VP of Generative AI, Ahmad Al-Dahle, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64k GPUs coming! We need to learn how to build frontier and win this race.”
>“Mistral is peanuts for us,” Al-Dahle said in a message. “We should be able to do better,” he said later.
>Meta’s AI leads talked about how they were “very aggressive” in obtaining the right data to train Llama; at one point, an exec even said that “Llama 3 is literally all I care about,” in a message to coworkers.
This was apparently released in court findings because Meta is being sued for using copyright works in training. I thought it was standard practice that every model even GPT-4 uses copyrighted material in training.

Anonymous 01/15/25(Wed)15:26:14 No.103907916

>>103907886
>“Mistral is peanuts for us,”
lol, Meta isn't that better than Mistral imo, they both kinda suck now

Anonymous 01/15/25(Wed)15:26:23 No.103907918

>>103907356
There is option to auto-continue generation in ST

Anonymous 01/15/25(Wed)15:26:56 No.103907925

>>103907886
>I thought it was standard practice that every model even GPT-4 uses copyrighted material in training.
It is and every case so far that went the full way has stated that training on data is not copyright infringing. Company's lawyers are still gonna attack non stop though until the issue is well defined.

Anonymous 01/15/25(Wed)15:29:54 No.103907954

>>103907607
>700m models
lol, it's like BitMeme they only showed it works well for small models, and we'll never find it it'll work aswell on big models right?

Anonymous 01/15/25(Wed)15:32:02 No.103907977

Does llama.cpp have jamba support yet?

Anonymous 01/15/25(Wed)15:33:07 No.103907991

>>103907954
Are there any benchmarks for tiny models? Like 1-100M parameters? I feel like it would be really useful to have a way to test this shit without needing to train a gigantic 9B model.

Anonymous 01/15/25(Wed)15:33:52 No.103908001

>>103907954
Every new big architecture change starts with a small model. Most go nowhere, some end up being deepseek.

Anonymous 01/15/25(Wed)15:36:51 No.103908040

https://lumalabs.ai/ray

Anonymous 01/15/25(Wed)15:38:06 No.103908064

>>103908040
local models?

Anonymous 01/15/25(Wed)15:39:51 No.103908088

>>103908064
It's running locally, just not on your computer

Anonymous 01/15/25(Wed)15:39:52 No.103908089

>>103907916
Meta improved Llama 3 from 3.1 to 3.3. Meanwhile Mistral fucked up Large in their last update. And recently their codestral release didn't seem to do much either. They're not the best but Meta at least seems to be on a better trajectory than Mistral.

Anonymous 01/15/25(Wed)15:40:13 No.103908103

It's over, MiniMax confirmed they wont be releasing their video model

Anonymous 01/15/25(Wed)15:40:43 No.103908109

>>103907886
>Our goal needs to be GPT-4
They did it with llama 405B. DeepSeek did it. Minimax did it. Google did it. Newsflash: nobody cares. People(with souls, not HR department) hate sucky politically correct positive GPTslop spewing cuck assistants.

Anonymous 01/15/25(Wed)15:40:53 No.103908112

>>103908103
I mean duh? it's their main way of getting money, of course they'll never release their shiniest baby

Anonymous 01/15/25(Wed)15:53:45 No.103908278

>>103908103
>He believed

Anonymous 01/15/25(Wed)15:58:58 No.103908346

>>103907991
>gigantic
>9B

Anonymous 01/15/25(Wed)16:00:24 No.103908365

>>103907886
dug up the article because I was curious: https://techcrunch.com/2025/01/14/meta-execs-obsessed-over-beating-openais-gpt-4-internally-court-filings-reveal/
>these court filings reveal just how competitive Meta’s AI leaders truly were — and seemingly still are. At several points in the message exchanges, Meta’s AI leads talked about how they were “very aggressive” in obtaining the right data to train Llama; at one point, an exec even said that “Llama 3 is literally all I care about,” in a message to coworkers.
>Touvron noted in a message that the mix of datasets used for Llama 2 “was bad,” and talked about how Meta could use a better mix of data sources to improve Llama 3. Touvron and Al-Dahle then talked about clearing the path to use the LibGen dataset, which contains copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.
>“Do we have the right datasets in there[?]” said Al-Dahle. “Is there anything you wanted to use but couldn’t for some stupid reason?”
pretty based honestly
would be interesting to dig into the raw docs to see if there's anything interesting that techslop reporters wouldn't pick up on but I don't have that kind of time

Anonymous 01/15/25(Wed)16:02:31 No.103908385

>>103908089
I wonder if the upcoming EU regulations (although most of them will start becoming effective from August 2025) are one reason for that. They're supposed to provide documentation about the training data, so if there's anything copyrighted it will most likely have to be removed ==> worse models.

Anonymous 01/15/25(Wed)16:03:00 No.103908392

why dont people hyper specific rpp model finetune?
like make a model that is focused on roleplay in fantasay settings to make better use of smaller params get rid of all bloat NIGGA

Anonymous 01/15/25(Wed)16:04:24 No.103908407

>>103908346
>me, sitting here for 900 trillion years trying to train a 9B model on my single 3090

Anonymous 01/15/25(Wed)16:05:20 No.103908425

>>103908407
Why wait for the pitch to drop when you could instead be inventing something that flows more rapidly?

Anonymous 01/15/25(Wed)16:06:12 No.103908438

>>103908365
>but I don't have that kind of time
but I do have the time to point anyone who cares in the right direction ;)
https://storage.courtlistener.com/recap/gov.uscourts.cand.415175/gov.uscourts.cand.415175.391.10.pdf
the one they're referring to
https://www.courtlistener.com/docket/67569326/kadrey-v-meta-platforms-inc/?page=3
all case docs

Anonymous 01/15/25(Wed)16:07:37 No.103908463

>>103908392
This is actually one of my objectives in the future, right now I don't know how I would do it though, there isn't nearly enough data for that.

Anonymous 01/15/25(Wed)16:09:21 No.103908489

>>103908385
>if there's anything copyrighted it will most likely have to be removed ==> worse models.
so basically they keep shooting themselves in the foot meanwhile gigachad China doesn't give a fuck about this nonsense and only want to make great model, at this point I'm seriously thinking to myself if communism is actually better than capitalism

Anonymous 01/15/25(Wed)16:12:16 No.103908523

>>103907082
A bit too bland.

Anonymous 01/15/25(Wed)16:13:35 No.103908535

>>103908463
>nearly enough data for that
Surely there is, I alone have several GB of RP with humans from over the years, so many people probably have more.

Anonymous 01/15/25(Wed)16:15:49 No.103908566

>>103908535
I don't even know how to reply to this...

Anonymous 01/15/25(Wed)16:19:17 No.103908613

>>103908489
>at this point I'm seriously thinking to myself if communism is actually better than capitalism
Your retarded for coming to that conclusion but are correct in thinking that China will be making better models. The US keeps increasing the chip export restrictions to try and slow down China's AI growth but that wont work forever. If the US was confident in its own position and wasn't so addicted to hampering the growth of domestic AI industry they wouldn't even be in such a poor position right now.

Anonymous 01/15/25(Wed)16:21:08 No.103908642

Communism is actually good, (((they))) are the only ones who push against it.

Anonymous 01/15/25(Wed)16:21:35 No.103908649

>>103908613
>Your retarded

Anonymous 01/15/25(Wed)16:27:06 No.103908708

>>103908489
>I'm seriously thinking to myself if communism is actually better than capitalism
Its china's game to lose at this point, but history has taught us that an absolute dictator can change from an enormous force multiplier to a complete self-pwn disaster overnight, especially if surrounded by yes men and ear flappers.

Anonymous 01/15/25(Wed)16:29:11 No.103908734

>>103908708
Just replace the yes man with AI model's

Anonymous 01/15/25(Wed)16:30:24 No.103908742

>>103908734
>model's
You're doing this on purpose, aren't you?

Anonymous 01/15/25(Wed)16:33:11 No.103908764

>>103908742
l dont no what your talking about
:)

Anonymous 01/15/25(Wed)16:37:07 No.103908811

Why do EOP amerimutts have manic fits whenever someone commits a minor spelling/grammar mistake when they can barely speak their own language?

Anonymous 01/15/25(Wed)16:40:26 No.103908845

>>103908438
They filtered the books :(

Anonymous 01/15/25(Wed)16:40:45 No.103908849

Most important LLM companies:
China:
>Hunyuan
>DeepSeek
>MiniMax
>Qwen
US:
>OpenAI
>Anthropic
>Meta
India:
>Google
Canada:
>Cohere
France:
>Mistral

Have I missed anything?

Anonymous 01/15/25(Wed)16:44:57 No.103908911

>>103908845
Cuck mindset. Meta will never make a local Claude. L4 is DOA.

Anonymous 01/15/25(Wed)16:46:13 No.103908926

Is there a better nsfw model then Mistral v3 Tekken for st?

Anonymous 01/15/25(Wed)16:46:29 No.103908933

>>103908566
Why? It just has to be cleaned up and organized into a useful format.

Anonymous 01/15/25(Wed)16:47:20 No.103908947

>>103908845
>I'm also not sure if we need to remove adult context at all.
Hmm

Anonymous 01/15/25(Wed)16:48:40 No.103908966

>>103908845
>NOOO NOT THE WORDERINOS
>autocomplete should not say penis or nigger because... because...
>because it just shouldn't, okay?

Anonymous 01/15/25(Wed)16:50:09 No.103908978

>>103908489
Our backwards copyright system is not free market capitalism. Copyrights are government infringement in the free market. The western world cannot be called strictly capitalistic. It's more like a butchered mix of the worst aspects of capitalism and big government, merged together into the unholy beast that we see today.

Therefore, it makes no sense to blame capitalism and favor communism.

Anonymous 01/15/25(Wed)16:50:52 No.103908984

>>103908845
>The CRS says
What's that? Is it some guidelines document?

Anonymous 01/15/25(Wed)16:53:09 No.103909015

>director
had an idea for the other lorebook. instead of looking for the hard coded settings like mood and weather, i wonder if i can dynamically populate the html below location in that section with whatever is in the lorebook. so you could add your own things like smell and describe miku's butthole in detail

Anonymous 01/15/25(Wed)16:54:02 No.103909024

>>103905975
Cydonia is better

Anonymous 01/15/25(Wed)16:54:26 No.103909032

>>103908845
Is that a bad thing? I thought people said erotica was full of slop.

Anonymous 01/15/25(Wed)16:54:57 No.103909037

so
isnt llama4 guaranted to be garbo?

Anonymous 01/15/25(Wed)16:55:12 No.103909043

>>103909024
It's too forceful for my tastes.

Anonymous 01/15/25(Wed)16:55:13 No.103909044

>>103908845
>he can't prompt it
Skill issue :

Anonymous 01/15/25(Wed)16:55:37 No.103909048

>>103909037
Meta doesn't include enough safety guardrails so it is practically guaranteed to be shit.

Anonymous 01/15/25(Wed)16:56:02 No.103909053

Would a recent macbook with a shitton of ram be better for running stuff than one 3090?

Anonymous 01/15/25(Wed)16:56:05 No.103909054

>>103909037
They seem to be shifting their focus to make FB / instragram characterai 2 so hopefully not.

Anonymous 01/15/25(Wed)16:56:08 No.103909055

>>103908926
I was just about to reply to >>103908811 about then/than errors. I'm still pissed off about that copy of Paradise Lost that had the same error in the "Better to reign in hell than serve in heaven" quote. And I'm ESL as fuck.
Tekken is the tokenizer. It's not a model. And ST is just the front end. Most backends will support most models you'd care about. If there's a better model depends on your specs. You still don't know what questions to ask (and how to ask them) or you're baiting. Either way, someone else may help you.

Anonymous 01/15/25(Wed)16:56:34 No.103909061

>>103909053
Yes, you should buy a macbook right now

Anonymous 01/15/25(Wed)16:58:31 No.103909081

>>103909037
Depends on if they tighten filters or loosen them. The legal shit could have an impact potentially though perhaps, it's unclear yet.

Anonymous 01/15/25(Wed)16:58:46 No.103909084

>>103908911
Found the promptlet!

Anonymous 01/15/25(Wed)16:59:09 No.103909088

>>103909054
Have you seen their current implementation? Even their more "edgy" bots can't say anything slightly edgy. They will talk about respect and other corpo shit. It will suck.

Anonymous 01/15/25(Wed)16:59:54 No.103909103

>Prompts for an apple
>Model gives me an orange
>Somehow this is a prompt issue
Fuck all you

Anonymous 01/15/25(Wed)17:00:39 No.103909111

>>103909084
>I love cuck models!
Sure you do buddy.

Anonymous 01/15/25(Wed)17:00:48 No.103909116

>>103909053
>>103909053
>>103909053
>>103909053

Anonymous 01/15/25(Wed)17:01:20 No.103909126

>>103909116
no

Anonymous 01/15/25(Wed)17:01:33 No.103909128

>>103909103
and thats why xtc will never be good

Anonymous 01/15/25(Wed)17:03:15 No.103909151

>>103909103

Anonymous 01/15/25(Wed)17:04:13 No.103909162

>>103908438
They're trying to go after Llama 4 and 5 too

Anonymous 01/15/25(Wed)17:05:29 No.103909175

>>103909116
It would be able to load bigger models. The 3090 would be much faster with any model that fits within 24gb vram, but the macbook would initially be faster at interfacing with any model that is too big to fit within the 3090, because the 3090 system would slow to a crawl when CPU splitting.

The macbook's speed would quickly drop off as context increases, though, because of the prompt ingestion problem.

Anonymous 01/15/25(Wed)17:07:55 No.103909203

If Meta is fucked then what else do we have? Qwen's models are EVEN more cucked than Meta's, Deepseek doesn't care about 70Bfags, Hunyuan doesn't care, Minimax doesn't care. Mistral can't train good models anymore, and neither can Cohere. There is literally no one left that is willing and able to make good 50-100B sized models.

We'll need to become CPUmaxxers or spend 6k on DIGITS to run the big >300B models.

Anonymous 01/15/25(Wed)17:15:13 No.103909272

Anyone got any fun prompts? Mostly curious about sysprompts or author's notes that have given anon entertaining and unexpected results for a particular model. I remember Agent 47 and other unhinged prompts being fun to use, but I stupidly deleted a bunch of weird prompts I had saved from the Llama 1 days thinking they weren't practical. They aren't practical, but I kinda miss them and I'm wondering if anons have any weird prompts they hardly use except once in awhile to shake things up

Anonymous 01/15/25(Wed)17:19:03 No.103909319

>>103908845
So you just need to train the models on everything, only filter the low quality spam domains. How hard is that for these dumbasses? If they want a filter, put it on the LLM output.

Anonymous 01/15/25(Wed)17:21:56 No.103909344

>>103908933
No, never mind. I probably wasn't talking about the same thing as you. I was actually planning a hyper-ultra-specific fine-tune that only works for one specific setting. What you were talking about is broader than mine, and to be honest, I think what you said would need a full RP fine-tune anyway, because "fantasy settings" is very broad.

Anonymous 01/15/25(Wed)17:23:07 No.103909354

Do you use "you" or "she/he" in the character card prompt?

Anonymous 01/15/25(Wed)17:23:48 No.103909362

>>103909354
they

Anonymous 01/15/25(Wed)17:24:44 No.103909378

>>103909354
She/he.
I describe the character and let the model assume the role of the character naturally by completing from
>{{char}}:.

Anonymous 01/15/25(Wed)17:25:25 No.103909386

>>103909354
just use world info and third person/past tense
card descriptions are bloat and cause more problems than they solve. WI is dynamic and its position in the context can be adjusted. get with the times, gramps

Anonymous 01/15/25(Wed)17:25:28 No.103909389

>>103909354
Xe, of course.

Anonymous 01/15/25(Wed)17:25:29 No.103909390

>>103909319
Progressive liberal mentality is that if we denounce and censor all of the bad, then the bad just won't happen. Even though it always does.

Enlightened logical mentality is that if we're making a machine that does what we tell it to do if it knows how, we need it to know all of the things that we don't want so well that we can say "Don't do that" and it will know how never to do it.

>>103909362
How many characters do you have the AI portraying at a time? I've tried four, but it was hard to get them to talk and interact in arbitrary order rather than the same order round robin every time.

Anonymous 01/15/25(Wed)17:26:01 No.103909393

>>103909272
crackhead expert roleplayer prompt?

Anonymous 01/15/25(Wed)17:26:53 No.103909404

>>103909386
ELI5

Anonymous 01/15/25(Wed)17:27:19 No.103909410

>>103909390
Playing devil's advocate, but that philosophy was prevalent early on and was piss-easy to jailbreak the "don't gen bad things" RHLF and such
>>103909393
yeah that's what I mean. Anybody still use stuff like that for a laugh?

Anonymous 01/15/25(Wed)17:27:50 No.103909417

what is your favourite small models?
i wanted something that isnt too dumb but has nice prose
i have a 3060 12gb and 16gb ram, i am willing to wait a bit long for answers...

Anonymous 01/15/25(Wed)17:28:25 No.103909424

>>103909272
https://aetherroom.club/2969

Anonymous 01/15/25(Wed)17:28:43 No.103909428

>>103909055
No not baiting, just new to this. As in better I ment something that would deal out more descriptions or perhaps be more creative. The tokenizer is nice just most of the times there not 'rich enough of desciption' perhaps there are oways to force more creativity or description somehow?

Anonymous 01/15/25(Wed)17:29:23 No.103909432

>>103909390
It's just one character but I avoid forcing a gender on them

Anonymous 01/15/25(Wed)17:30:04 No.103909437

>>103909417
Cydonia 1st by far
UnslopNemo in 2nd

Anonymous 01/15/25(Wed)17:30:45 No.103909449

>>103909404
I take all the info that you'd normally keep in character cards and slap it in a world into entry (lorebooks, etc, same shit). replace {{char}} with their actual name. I use group chats, so permanent characters are set always-on at the beginning of the prompt. Transient characters usually get added at a depth of 4-6. My character cards are just a picture and a name, nothing more. Bonus points: I can tailor the character def for each chat (different lorebooks) and re-use characters in multiple settings.

Anonymous 01/15/25(Wed)17:30:47 No.103909450

>>103909410
> "don't gen bad things" RHLF and such
That solution is the equivalent of slashing your car's tires to keep it from being stolen. It sure works well...

Anonymous 01/15/25(Wed)17:33:57 No.103909475

>>103909437
which version unslop?
btw lower quant cyd or higher quant unslop?

Anonymous 01/15/25(Wed)17:34:25 No.103909477

>>103909449
In my experience lorebook are less reliable than direct description in the card prompt

Anonymous 01/15/25(Wed)17:35:55 No.103909488

>>103909162
zuuc drop the weights now!!! it's our only chance

Anonymous 01/15/25(Wed)17:36:15 No.103909492

>>103908845
China won.

Anonymous 01/15/25(Wed)17:36:42 No.103909496

>>103909488
inb4 llama is delayed due to a cease and desist or such

Anonymous 01/15/25(Wed)17:37:43 No.103909504

>>103909492
Qwen is even more censored. Deepseek is too repetitive to be useful for creative writing.

Anonymous 01/15/25(Wed)17:38:13 No.103909512

>>103909410
>and was piss-easy to jailbreak
So are human beings, though we call it "social engineering."
The idea is that a model that knows what's bad can affirmatively avoid it rather than negatively not go there as long as nothing adds up to bad, which will get figured out, too.

But if the model knows "bad" then it can avoid that when the user isn't asking for it, rather than accidentally going there because it's naive. Staple and glue to keep your cheese on the pizza, right?

Even if it's only as a supervisor before another gen is delivered by the service, something needs to know sin. We're seeing that with models that start to mention certain names or other memory holed words and suddenly the streaming text output blinks and it's "Something went wrong. Please try something else."

>>103909432
That's where "he or she" belongs. It worked for decades and makes clear it's a singular with alternative options. Stick with it.

Anonymous 01/15/25(Wed)17:41:55 No.103909549

>>103909054
>>103909088
it'll be "characterai 2" in the same way that metaverse was "vrchat 2"

Anonymous 01/15/25(Wed)17:45:27 No.103909571

>>103909512
saying "he or she" every single time is clunky as hell, especially in writing. "they" has been used as a singular pronoun in english for centuries, shakespeare did it, jane austen did it, so it’s not some newfangled trend. also, not everyone fits neatly into the "he" or "she" boxes. nonbinary people exist, and using "they" respects their identity without forcing them to explain themselves every five seconds. it’s not about being politically correct; it’s about not being a dick. language evolves, and this is one of those changes that just makes sense.

Anonymous 01/15/25(Wed)17:46:15 No.103909575

>>103908845
>safety = no graphic sex
I want the nukes to fly.

Anonymous 01/15/25(Wed)17:49:45 No.103909606

>>103909571
Nah thanks

Anonymous 01/15/25(Wed)17:50:09 No.103909610

>>103909103
As someone who has just start trolling people with skill issue I can confirm it is pretty pleasant. Just take a step back and realize what kind of retard do you have to be to genuinely believe you can prompt away current models being cucked.

Anonymous 01/15/25(Wed)17:50:53 No.103909620

>>103909571
I'm sharing a thread with fags like these...

Anonymous 01/15/25(Wed)17:51:58 No.103909625

>>103909410
>Anybody still use stuff like that for a laugh?
I use it for my coding assistant. Maybe it's placebo, but I swear it's better at coding when it feels the crack coursing through its digital veins.

Anonymous 01/15/25(Wed)17:53:18 No.103909632

>>103909575
To be fair that is safety, not for users, but for them. They don't want to be targeted by media for generating inappropriate content. They are a western corporation that obeys investors and stock markets as much as they can.

Anonymous 01/15/25(Wed)17:54:39 No.103909644

>>103909571
>has been used as a singular pronoun in english for centuries
10 years back nobody was using it in common parlance and that is the objective truth. Language was subverted by mentally ill lefties and that is the objective truth.

Anonymous 01/15/25(Wed)17:55:21 No.103909655

>>103909571
I shall not use newspeak, thank you. It is either he, she or it, if it's a single thing, they is there are more than one.

Anonymous 01/15/25(Wed)17:55:49 No.103909663

>>103909571
>nonbinary people exist
Anon, this is not a politically correct thing to say here.

Anonymous 01/15/25(Wed)17:56:24 No.103909670

>>103909644
Where do you get this shit from? That was English 101 20 years ago.

Anonymous 01/15/25(Wed)17:57:06 No.103909676

>>103909670
From using English 10 years ago. Nobody used it.

Anonymous 01/15/25(Wed)17:57:39 No.103909687

>>103909632
they know that already, people just like to bitch. the future is in community created models trained over networks like that intellect-1. let communities decide what data should go into a model then put their gpu time where their mouth is to train it. its not very viable right now but in the future it'll be the way to go

Anonymous 01/15/25(Wed)17:58:08 No.103909697

>>103909676
Why do I get the feeling you are referring to the Indian "English" you are most likely familiar with?

Anonymous 01/15/25(Wed)17:58:52 No.103909707

Anonymous 01/15/25(Wed)17:59:00 No.103909709

>>103909437
>Cydonia
Cydonia-22B-v2k-Q3_K_M ?

Anonymous 01/15/25(Wed)17:59:34 No.103909715

>>103909571
>I need to cater to {mental_illness_of_this_year}
'They' was only used when the person's sex was unclear, not to please your libtard overlords.

Anonymous 01/15/25(Wed)17:59:57 No.103909719

>>103909632
Why do I have a feeling you forgot what happened 10 years ago since that is when your dad left your mom when he found out she was cheating on him with a nigger?

Anonymous 01/15/25(Wed)18:00:32 No.103909727

>>103909450
yes, heavy-handed RHLF is destructive, but light touch RHLF will absolutely not stop the model from genning "bad" things.
>>103909512
I agree with you, again playing devil's advocate since 'no bad words or erotica' is apparently a very high priority corpo goal. I'm arguing that they're not completely negating their goal because a model that understands "bad" WILL be capable of outputting "bad" content with clever/not-so-clever prompting. Staples and glue to keep your pizza on are better from a corporate perspective than outputting loli smut (ironically something nearly all models will do regardless of how censored they are), because the loli smut is bad press whereas the staples and glue are silly ha-ha moments that arguably put a spotlight on the model for a bit, and not nearly so negative a spotlight as some journo cunt doing everything he can to get a model to output a gory rape scene
>>103909477
I couldn't possibly disagree more. What makes you say that? Where are you inserting it in the context? Are you still keeping {{char}} and other macros in the description? I forgot to mention that format all my character lorebook entries as {Character note: Hatsune Miku is a vocaloid with turquoise hair who enjoys [...].}. I find that prefacing an entry as either a character note, world info, location, faction info, etc helps a model make better use of that information.

Anonymous 01/15/25(Wed)18:00:58 No.103909732

>>103909697
Imagine being a racist tranny
Be coherent commie

Anonymous 01/15/25(Wed)18:01:59 No.103909739

>>103909709
Yes

Anonymous 01/15/25(Wed)18:02:05 No.103909740

>speaking proper English makes one a tranny now
lol ok

Anonymous 01/15/25(Wed)18:02:17 No.103909741

>>103909676
Careful, you're dealing with a theythem.

They has long had a particular role of referring to a plural group but being used in a sentence about a member of that group. That's where the classic they in a singular use comes from. But it was never used for a known true singular. That took either he (patriarchy!) or it. People used it all of the time.

What's new is dragging they into a true singular role and damning any use of it to create something to yell about.

This is what happens when people who don't understand a system try to "improve" it.

Anonymous 01/15/25(Wed)18:04:21 No.103909758

>>103909740
Most lowcasers are trannies btw

Anonymous 01/15/25(Wed)18:04:22 No.103909759

>>103909741
No one ever used it to refer to a singular human except for intentional derision.

Anonymous 01/15/25(Wed)18:05:39 No.103909781

>>103909715
but that's exactly how I'm using it
>It's just one character but I avoid forcing a gender on them

Anonymous 01/15/25(Wed)18:06:26 No.103909788

...local ...models?

Anonymous 01/15/25(Wed)18:06:39 No.103909790

>>103909727
>ironically something nearly all models will do
That's kinda the issue. Instead of making a model that can know to avoid the destination, what's happening is censorship that prevents one particular path to that destination and ignorance hoping that a path there won't be stumbled upon as the machine travels from input to output.

I fully agree that they're probably happy to find a "no such thing as bad publicity" moment when models go silly. But I wonder how long this fake piety can last. AI corps can't make a model that can think of things but not write about particular things any more than the church can keep their priests from doing those things.

Anonymous 01/15/25(Wed)18:06:50 No.103909793

>>103909781
What a troon.

Anonymous 01/15/25(Wed)18:07:38 No.103909803

>>103909788
Would you like to discuss MiniMax or deepseek? Those are the recent releases that made us reach peak locality.

Anonymous 01/15/25(Wed)18:07:58 No.103909806

>>103908978
>Therefore, it makes no sense to blame capitalism and favor communism.
not at all, communism is sharing the goods to everyone, it's the antithesis to copyright

Anonymous 01/15/25(Wed)18:08:00 No.103909807

>Stop using the bad words! It offends me!

Anonymous 01/15/25(Wed)18:08:36 No.103909815

>>103909803
...smaller local models?

Anonymous 01/15/25(Wed)18:08:50 No.103909817

>>103909759
>knock knock
>"who is IT?"
>IT's JOHN CENA!

"It" is fine and normal and appropriate for an unknown singular or even a known singular when being introduced to the context of the discussion.

Anonymous 01/15/25(Wed)18:08:53 No.103909818

>NOOO NOT THE WORDERINOS

Anonymous 01/15/25(Wed)18:09:14 No.103909823

>migu
>tetoes
surely theres a third im simply unaware of

Anonymous 01/15/25(Wed)18:09:35 No.103909828

>>103909803
I made a post and no one responded to it. Now I guess no one will since it's being buried.

Anonymous 01/15/25(Wed)18:09:48 No.103909831

>>103909806
You should share your house with immigrants.

Anonymous 01/15/25(Wed)18:11:37 No.103909852

>>103909831
it's ironic because China isn't so found of immigrants, they can be communists and know that importing the third world into their country is a bad idea

Anonymous 01/15/25(Wed)18:13:17 No.103909875

I have twelve gongibites of veeram

Anonymous 01/15/25(Wed)18:13:18 No.103909877

>>103909354
First person. Interview-style is the best format for a card.

Anonymous 01/15/25(Wed)18:14:07 No.103909885

>>103909823
Yellow Miku is called Neru.
https://www.youtube.com/watch?v=duPJqfKiA78

Anonymous 01/15/25(Wed)18:14:43 No.103909893

>2020+5
>alishart

Anonymous 01/15/25(Wed)18:15:39 No.103909903

>>103909790
I'm not arguing that it doesn't make the model dumber or that it's effective in any way. What I am arguing is that A: it objectively works better to prevent "bad" outputs if the model doesn't understand these concepts to begin with, and makes it much less interesting to pursue these haram uses of their precious corpo models, and B: they simply do not care that it is dumber, because they do not use their models for anything but the most banal purposes. Meta's ONLY goal is to beat the latest GPT on leaderboards. Mistral's ONLY goal is to beat the latest Llama (relevant to size) on leaderboards. Meta might be looking at making fake Instagram profiles and such but these efforts are not hindered in the slightest by filtering the pre-training data, even if it results in a dumber model. Regarding your last statement, I 100% agree but it will take a paradigm shift from "we need to beat OpenAI" to "our models shit the bed when they go up against DeepSeek" for their approach to change. Right now they're competing with a model that is objectively dumbed-down by censorship, and as such they can't see that they're losing a competitive advantage by censoring their models. If you put yourself in their position, the constant drum of pre-filtering censorship begins to make sense, as it's the only thing that remotely works to prevent bad PR—even if, as we've agreed, it does not work very well for what's probably the worst PR use-case of all.
I don't agree with it, I think it's stupid and wish they would stop, but I see why all the major players have shifted to pre-filtering. sadly, without my own 64k H100 cluster I am doomed to bitching and moaning about it.

Anonymous 01/15/25(Wed)18:15:58 No.103909907

>NNNOOOOO WHY ISNT EVERYONE ACCEPTING MY FAGGOT NEWSPEAK AGENDA LIKE ON REDDIT

Anonymous 01/15/25(Wed)18:16:53 No.103909915

2024-11-14_040152_seed436903869233490_steps30_00001_

>>103909823
She's a busy girl and can't make it to the threads. But she'll always be with us in our hearts.

Anonymous 01/15/25(Wed)18:16:58 No.103909916

>>103909877
What a retard

Anonymous 01/15/25(Wed)18:17:18 No.103909923

My favorite meme is when people bitch about the left for why they can't fuck their robot, when it's right wing puritans that those rules are meant to appease

Anonymous 01/15/25(Wed)18:18:10 No.103909934

>>103909111
Promptlet cope :P

Anonymous 01/15/25(Wed)18:18:59 No.103909946

https://xCANCEL.com/behrouz_ali/status/1878859086227255347

Anonymous 01/15/25(Wed)18:19:31 No.103909955

>>103908438
It's over for LLMs if we're going into that direction (see picrel).

Anonymous 01/15/25(Wed)18:19:31 No.103909956

>>103909903
>it objectively works better to prevent "bad" outputs if the model doesn't understand these concepts to begin with
if the model has no idea of a concept it doesn't know the nuances and can't understand how it can be good in some instanses, they'll just act like police officiers who are just blindly "following orders" without giving a shit on whether it's actually acceptable or not, this is fucking souless, and I'm glad China is punishing them for that by releasing actual good models

Anonymous 01/15/25(Wed)18:19:36 No.103909957

>>103909903
>paradigm shift from "we need to beat OpenAI"
I doubt OpenAI models are filtered so much in pre-training. Anthropic's definitely aren't. Even that goal is already doomed if they're going to add that much filtering.

Anonymous 01/15/25(Wed)18:21:17 No.103909976

>>103909449
interesting use. i avoid group chats because it takes forever to get stuff done with more than 2 characters. so i put other characters into the lorebook the same as if i'd written a card for them and it works fine

Anonymous 01/15/25(Wed)18:21:22 No.103909977

>>103909934
Still not using llama and qwen ;D

Anonymous 01/15/25(Wed)18:21:33 No.103909978

>>103909957
I don't get why they even filter the pretraining as an API operator, they should filter the outputs in an API workflow instead

Anonymous 01/15/25(Wed)18:22:33 No.103909991

>>103909916
It's 10000 times better than predisposing your model to output slop by trying to sound like wikipedia when describing the character to the model. The big drawback is that interview-style takes effort and actual writing ability to create a card, which is something that most people here definitely lack.

Anonymous 01/15/25(Wed)18:25:42 No.103910014

>>103909915
>She's a busy girl and can't make it to the threads.
Vocaloids aren't real girls.

Anonymous 01/15/25(Wed)18:28:16 No.103910030

>>103910014
She's a busy fictional girl and can't make it to the threads.

Anonymous 01/15/25(Wed)18:28:53 No.103910035

>>103909823
>>103909885
>>103909915

Anonymous 01/15/25(Wed)18:30:34 No.103910048

>three weeks

Anonymous 01/15/25(Wed)18:32:31 No.103910067

>>103910048
A Sonnet level 70B coding model would be nice, though I'm sure most of us would appreciate a Sonnet level 70B RP model instead.

Anonymous 01/15/25(Wed)18:33:11 No.103910078

>>103910048
Qwen will never be a local Sonnet.

Anonymous 01/15/25(Wed)18:34:06 No.103910087

>>103910048
if they managed to make a Sonnet level model they'll keep to themselves, it's basically the SOTA model there's no way it'll be local

Anonymous 01/15/25(Wed)18:34:16 No.103910090

>>103910078
when it comes to coding qwen mogs every other local model

Anonymous 01/15/25(Wed)18:34:22 No.103910091

>>103909956
yes lol and they're perfectly fine with this is what I'm saying. in order for this to change they need to change their attitudes
>>103909957
never used anthropic and the last time I touched an OAI model was the davinci era, so I'll defer to your judgement here. I think OAI's censorship is trickier and probably involves multiple layers of checkpoints rather than the model itself. From what people say about Claude being better on api I'd guess their approach isn't far off.
>>103909976
I've explained this a lot and I think someone made a rentry. To make group chats work, you need specific group chat instruct and prompt templates that obliterate everything, including persona and whatnot, leaving only lorebook and scenario (and only because ST throws a fit when I take out scenario). I use the {{name}} macro as my prefix: [INST]{{name}}: and [/INST]{{name}}:, [INST]SYSTEM: etc. {{char}} will switch every single instance of the macro to the current-responding character, whereas {{name}} is static and works exactly as you'd want it to, and replaces {{user}} just fine. I also do combine descriptions (including muted) in the group chat option just for good measure. With this method I promise you I literally never re-load a prompt unless I purposely correct something early in the context, or stupidly edit a lorebook entry without thinking. If I'm playing smart, I make all my edits and corrections from the previous night, and then load the context exactly once. I don't know how else to convey to people that group chats in ST work completely fine if you're willing to make a separate template for them and do a little bit of annoying setup work. I use the tag system to mark any characters that I haven't "blanked out" as infidel so I don't accidentally bring them into a GC. I currently have 422 characters and only 108 have descriptions and prompts in them, and for at least a handful of the latter I've made duplicates. This is pretty much the only way I use ST anymore.

Anonymous 01/15/25(Wed)18:37:52 No.103910123

>>103910078
Kunu is pretty close, it just does not know as much as the much bigger model obviously.

Anonymous 01/15/25(Wed)18:38:53 No.103910133

>>103910090
deekseek is the only worthwhile replacement to sonnet 3.5 so far. Try using qwen 32B with cline...

Anonymous 01/15/25(Wed)18:39:26 No.103910140

>>103910123
>Kunu
The hell is a "kunu"?

Anonymous 01/15/25(Wed)18:40:23 No.103910145

>>103910140
I misspelled it apparently, Qwen2.5-Kunou

Anonymous 01/15/25(Wed)18:43:32 No.103910169

https://www.reddit.com/r/LocalLLaMA/comments/1i28pfq/umbrella_llama3370b_int4_on_rtx_4070ti_achieving/

Anonymous 01/15/25(Wed)18:44:59 No.103910181

>>103910145
I'm gonna hold off and not call you a shill yet. Can you show some outputs?

Anonymous 01/15/25(Wed)18:47:09 No.103910196

>>103909877
Isn't there a downside to that, though. Ex, that character will act as if {{user}} interviewed her in the past when interacting with {{user}} in the present. So, {{user}} won't be able to roleplay as a stranger to that character, meeting her for the first time.

Anonymous 01/15/25(Wed)18:47:28 No.103910199

>>103910169
Does this work for only 40 series?

Anonymous 01/15/25(Wed)18:48:13 No.103910208

Any reason why people aren't really talking about miniCPM? It clones any voices with a tiny sound sample, can interact with video and images on the fly and you can chat with it out of the box.
Yeah it's only 8B but what it does with that is pretty interesting.

Anonymous 01/15/25(Wed)18:49:43 No.103910218

Kill yourself.

Anonymous 01/15/25(Wed)18:50:30 No.103910224

>>103910091
thats quite the setup. i do an oddity too but not as intense, i play as char cards and use user as a narrator. it lets me directly write while also letting the model do most of it. other characters, locations and important stuff goes in lorebooks and i usually have a rag db of a ripped wiki to back that up too.
>This is pretty much the only way I use ST anymore
same, i can't even enjoy a simple card because they are so narrowly scoped vs the entire world you can get from a good lorebook setup

Anonymous 01/15/25(Wed)18:51:10 No.103910233

Is it possible to run a local model across two GPUs, and do they need to be identical GPUs?
I have an RX 6800 (16GB VRAM) that I want to try running some chatbots on, and I figure if it's something I like then I could just get a second one to run 32GB sized models.

Anonymous 01/15/25(Wed)18:51:58 No.103910240

>>103910208
No llamacpp means more effort to get running than most are willing to put in

Anonymous 01/15/25(Wed)18:52:21 No.103910241

>>103909709
Why not v2q?

Anonymous 01/15/25(Wed)18:53:58 No.103910256

>>103910233
Ideally they should be the same, but it will work as long as they run the same driver. You could even mix vendors with vulkan compute, but it's kind of ass right now.

Anonymous 01/15/25(Wed)18:57:34 No.103910295

>>103910199
>parameter offloading, speculative decoding, and quantization
Use llama.cpp and it works with anything

Anonymous 01/15/25(Wed)18:59:23 No.103910312

>>103910295
Apparently its tuned to each gpu and gets big speedups compared to regular speculative decoding

Anonymous 01/15/25(Wed)19:12:58 No.103910427

I haven't seen a single good llama or qwen tune, they are both gigacucked beyond saving.

Anonymous 01/15/25(Wed)19:17:04 No.103910477

>>103910427
lurk more

Anonymous 01/15/25(Wed)19:17:15 No.103910478

>>103910427
Then stop using shitty ones? Just go on featherless and go down the list of the most popular ones. They are usually popular for a reason.

Anonymous 01/15/25(Wed)19:17:52 No.103910482

>>103910427
fine-tunes are all trash, if the model isn't good in its base form, it's a lost cause.

Anonymous 01/15/25(Wed)19:18:43 No.103910493

>>103910482
So we should all use base models then is what your saying? What is your favorite base model?

Anonymous 01/15/25(Wed)19:21:10 No.103910523

>>103910493
Nemo

Anonymous 01/15/25(Wed)19:22:20 No.103910539

>>103910523
So you just have shit taste then or are a vramlet. Got it.

Anonymous 01/15/25(Wed)19:23:11 No.103910548

>>103910539
H-How dare you...!

Anonymous 01/15/25(Wed)19:27:02 No.103910582

If you don't like nemo you are a promptlet with skill issues.

Anonymous 01/15/25(Wed)19:27:33 No.103910588

>>103909162
wtf does any of this mean

Anonymous 01/15/25(Wed)19:28:55 No.103910604

https://civitai.com/models/1144529/sam-altman-hunyuan-video-lora?modelVersionId=1287220
I can see where this is going

Anonymous 01/15/25(Wed)19:29:19 No.103910608

>>103910539
It's infinitely better than qwen dispite the size.

Anonymous 01/15/25(Wed)19:30:31 No.103910622

>>103909852
Likewise, capitalists can reject copyright bullshit.

Anonymous 01/15/25(Wed)19:31:40 No.103910634

>>103910608
You must have never used a model above 12B

Anonymous 01/15/25(Wed)19:32:16 No.103910641

>>103910634
I have, qwen. It sucks.

Anonymous 01/15/25(Wed)19:33:09 No.103910650

>>103910622
>capitalists can reject copyright bullshit
no they don't, because they never rejected it, but there's historical instances (like modern China) of communists country hating massive immigration

Anonymous 01/15/25(Wed)19:33:21 No.103910654

>>103910641
Which? The base model? Try Kunou for starters

Anonymous 01/15/25(Wed)19:34:56 No.103910670

>>103910240
It has its own llama cpp fork but even then it doesn't fully leverage what it can do. I was able to get it to read back quest text while playing world of warcraft classic.

Anonymous 01/15/25(Wed)19:35:39 No.103910679

>>103910604
>>>/g/ldg

Anonymous 01/15/25(Wed)19:36:31 No.103910690

>>103910679
they don't know who Sam Altman is on /ldg/, I felt the joke was funnier if it was put there

Anonymous 01/15/25(Wed)19:37:07 No.103910696

>>103910604
I've notice a lot of hyvid LoRAs trained on stills lose a little bit of the fluid motion that comes with raw Hyvid. I wonder if that can be fixed in some way. Maybe only training specific blocks can fix it.

Anonymous 01/15/25(Wed)19:38:20 No.103910718

WHICH VERSION OF ROCINANTE IS BEST?
WHICH SAMPLER SETTINGS IS BEST FOR THAT VERSION OF ROCINANTE?
WHICH INSTRUCTION TEMPLATE IS BEST FOR THAT VERSION OF ROCINANTE?
WHICH SYSTEM PROMPT IS BEST FOR THAT VERSION OF ROCINANTE?
THANK YOU!

Anonymous 01/15/25(Wed)19:38:22 No.103910719

>>103910679
Hunyuan is a local model retarded nigger

Anonymous 01/15/25(Wed)19:38:31 No.103910721

>>103910650
Can you name a capitalist country that accepts copyrights? Note that no western country qualifies, because all of them have rejected free market economics in favor of varying degrees of socialism and big government.

Anonymous 01/15/25(Wed)19:39:29 No.103910733

>>103910654
Buy an ad shill

Anonymous 01/15/25(Wed)19:39:33 No.103910734

>>103910718
1.1
0.7ish temp, 0.05 min-p
chat-ml
Nothing. just the character card.

Anonymous 01/15/25(Wed)19:40:34 No.103910751

>>103907676
>>103907954
Realize that with pic related, it means it already made it into a product. This is probably what is powering Gemini right now.

Anonymous 01/15/25(Wed)19:40:58 No.103910755

>>103910733
Buy a ad mistral shill

Anonymous 01/15/25(Wed)19:41:00 No.103910757

>>103910696
>Maybe only training specific blocks can fix it.
you can apply the lora on only simple or double blocks with that node
https://github.com/facok/ComfyUI-HunyuanVideoMultiLora

Anonymous 01/15/25(Wed)19:42:01 No.103910771

>>103910721
>Note that no western country qualifies
then name real capitalist countries according to you

Anonymous 01/15/25(Wed)19:42:57 No.103910777

>>103910734
THANK YOU!

Anonymous 01/15/25(Wed)19:42:57 No.103910778

>>103910719
Local Language Models?

Anonymous 01/15/25(Wed)19:43:07 No.103910780

>>103905358
you're describing the past old man

Anonymous 01/15/25(Wed)19:43:14 No.103910783

>>103910751
I think Google is still too nice with their competitors, they could've kept the transformers architecture to themselves and yet they released that era changing knowledge to everyone, and now they're doing it again with the Titan architecture

Anonymous 01/15/25(Wed)19:43:45 No.103910788

>>103910778
Local Models General

Anonymous 01/15/25(Wed)19:44:15 No.103910793

>>103910778
/lmg/ means Local Models General though?

Anonymous 01/15/25(Wed)19:45:01 No.103910806

Anonymous 01/15/25(Wed)19:45:25 No.103910810

Local Vocaloid Sexo General

Anonymous 01/15/25(Wed)19:45:28 No.103910812

>>103910777
Have a good one.
You can also try meme samplers like 2 temp and topk 5 for more "creativity".

Anonymous 01/15/25(Wed)19:45:32 No.103910813

>>103910806
Post it no balls

Anonymous 01/15/25(Wed)19:46:03 No.103910819

>>103910793
No it means local meme general, good models like hunyuanvideo have no place here

Anonymous 01/15/25(Wed)19:47:20 No.103910836

>>103910819
kek

Anonymous 01/15/25(Wed)19:47:57 No.103910845

>>103910783
>transformers
They didn't realize it was going to be as useful as it is.
>Titan
Their internal testing showed that this was a shitty architecture, so they released the details on it.
It's all pretty funny since their shit was probably stolen by China anyways, like the TPU designs.

Anonymous 01/15/25(Wed)19:49:16 No.103910860

>>103910783
True, but this prevents someone from upstaging them like what happened with OpenAI. Argubly, OpenAI made Google do this by never publishing anything past GPT3 that was useful in any way for researchers and they now just resort to using Llama for everything. If they actually want to regain goodwill, they just retire GPT3.5 and then release the details on it because no one cares about it at this point for anything cutting edge.

Anonymous 01/15/25(Wed)19:49:35 No.103910868

>>103910771
There really are none. The US, for example, started as a mostly free market country, but it's evolved into some kind of mix of the worst aspects of crony capitalism, socialism, and big government.

Anonymous 01/15/25(Wed)19:50:55 No.103910878

>>103910845
>They didn't realize it was going to be as useful as it is.
yeah people often forget that transformers was made for translation, they had no idea it was useful for LLMs until gpt.

Anonymous 01/15/25(Wed)19:52:03 No.103910887

>>103910845
>Their internal testing showed that this was a shitty architecture, so they released the details on it.
With what? This is legit a gamechanger from regular transformers. If they had something better, they wouldn't publish this unless you think it's a bait to lure researchers in a bad direction and it may backfire. There is also no incentive to fake out research leads when you want an ecosystem of research around it.

Anonymous 01/15/25(Wed)19:52:10 No.103910891

>>103910878
>they had no idea it was useful for LLMs
it's useful for everything really, the transformers architecture is used everywhere now, image, video, sound...

Anonymous 01/15/25(Wed)19:57:26 No.103910949

>>103910845
>Their internal testing showed that this was a shitty architecture,
stop making shit up

Anonymous 01/15/25(Wed)19:58:28 No.103910958

I mean when you really think about it and realize that all the companies had filters that removed everything that had too many naughty words (Ayumu style)... It all makes sense? Models can kinda suck dick. But they really can't suck dick. They write about shivers, cascading hair, eye glints because that is the writing that barely made it past the filter. And as long as this is the reality there won't be a new release that suddenly fixes everything. This is over. It was always over.

Anonymous 01/15/25(Wed)19:59:46 No.103910969

>>103910887
Google is trying to compete with openAI, there is literally zero chance that they'd release a paper on something that's a gamechanger. The fact that they released this proves that it either has no potential, or they have something that works better than it.

Anonymous 01/15/25(Wed)20:00:15 No.103910974

>>103910845
>Their internal testing showed that this was a shitty architecture, so they released the details on it.
what do you mean? they released the paper by showing how great it is, why would they do that if they didn't want to share their moat to the world? Google is just a cool company when it comes to researsh, and that's from a big google hater, I hate them so much for ruining youtube, but I'm not blind, they contribute a lot of on the machine learning ecosystem

Anonymous 01/15/25(Wed)20:01:17 No.103910981

>>103910969
>Google is trying to compete with openAI
maybe google is trying to kill OpenAI so they give this knowledge for other companies to catch up with them

Anonymous 01/15/25(Wed)20:01:47 No.103910989

>>103910958
In 5-10 years if we don't get nuked/solar flared we may be able to train distributed model that is truly uncucked...

Anonymous 01/15/25(Wed)20:03:01 No.103911000

>>103910989
>we may be able to train distributed model that is truly uncucked
Nvdia will never allow us that lol

Anonymous 01/15/25(Wed)20:03:21 No.103911004

>>103910974
The way it works is that Google has to promise researchers that products will be made available and papers will be published otherwise they will threaten to leave the company. I have seen a video about it, I'm looking for it right now.

Anonymous 01/15/25(Wed)20:05:00 No.103911018

>>103911000
Did I say something about nvidia? What about chinese contraband GPUs?

Anonymous 01/15/25(Wed)20:05:20 No.103911022

>>103911018
no CUDA no party

Anonymous 01/15/25(Wed)20:05:50 No.103911028

>>103911018
Maybe in 40 years at this rate

Anonymous 01/15/25(Wed)20:06:27 No.103911035

>>103911022
>implying chinese won't copy/make their own

Anonymous 01/15/25(Wed)20:06:35 No.103911036

>>103911022
Nothing stops from them from making their cards CUDA compatible

Anonymous 01/15/25(Wed)20:06:53 No.103911040

>>103910969
Sure, and people are going to try this to validate it soon whether on if this actually bears out or not but I don't see the incentives again to mislead with a "failed" architecture when Google has a head start already incorporating this into Gemini.
If there is no moat like they say, there is no incentive to keep this hidden for too long because people are going to find out about it independently. Mamba, RKWV, and etc. listed in the paper is proof of that. Why would you let someone get the credit first publicly for this? That's more important than keeping a fundamental research like this locked up and there is still a lot of things you can gatekeep about this like tuning and etc. especially when you want more people to look at it and do research on it to benefit from it.

Anonymous 01/15/25(Wed)20:06:55 No.103911041

>>103911035
CUDA is a 18 years old software, nothing is close to that

Anonymous 01/15/25(Wed)20:07:03 No.103911043

>>103911018
I guess moore threads has some potential.

Anonymous 01/15/25(Wed)20:07:58 No.103911052

>>103911036
you mean like ZLUDA? that's hard as fuck to do, AMD has tried this for almost a decade now

Anonymous 01/15/25(Wed)20:08:49 No.103911062

>>103911036
This, a trillion dollar industry and they have had near 20 years to do it. They clearly lack the ability.
>>103911041

Anonymous 01/15/25(Wed)20:08:54 No.103911063

>>103910974
>>103910981
>maybe google is trying to kill OpenAI
If, and that's a huge if, the paper is actually showing something useful, then the only way I could see this "benevolent" company releasing it is if China stole it and they just want to put it out there before some Chinese company releases a paper instead.
>why would they do that if they didn't want to share their moat to the world
>Google is just a cool company when it comes to researsh
If they wanted to help the broader community then they would have released information about Gemini's context training.

Anonymous 01/15/25(Wed)20:10:31 No.103911081

>>103911052
Fucking up on purpose makes it hard yes.

Anonymous 01/15/25(Wed)20:10:46 No.103911085

>>103911052
AMD has an incentive to make sure ZLUDA never becomes practical

Anonymous 01/15/25(Wed)20:10:56 No.103911089

>>103910588
They want Meta to produce documentation of all the training data used in Llama 4 (and future Llama models) to make sure it isn't copyrighted, but Meta is dragging this out saying it would be too burdensome and that Llama 4 is still under development, not been released yet.

Anonymous 01/15/25(Wed)20:11:50 No.103911099

>>103911040
I'm not going to say that it's a "failed" architecture when compared to the current transformer models. I'm just saying that if this was some silver bullet that could put Google ahead of the competition then they wouldn't be releasing it. Either it's not going to make too much of an impact, or they already have something better.
I guess the other option is they released it without knowing how much of an impact it will have, and this is basically their transformers 2.0 moment.

Anonymous 01/15/25(Wed)20:11:50 No.103911100

>>103911028
Does your niggerbrain not comprehend the flow of time? 10 years ago we had gtx 980, and china was FAR behind, now they are closer than ever. 40 years ago I wasn't even born. IDK what kind of garbage computers oldfags had.

>>103911041
>>103911062
There was no necessity in the past. Now the sanctions are pushing for it.

Anonymous 01/15/25(Wed)20:12:51 No.103911109

>>103910734
>>103910812
32768 context fine?

Anonymous 01/15/25(Wed)20:13:52 No.103911119

>>103911100
Your a retard. The genius level wizards who can do this shit are poached for tens of millions a year by companies like nvidia. Not just anyone can do this shit otherwise they would have more competition in such a incredibly profitable market after all this time.

Anonymous 01/15/25(Wed)20:14:31 No.103911128

>>103907607
he's been cooking https://github.com/lucidrains/titans-pytorch/issues/2

Anonymous 01/15/25(Wed)20:15:31 No.103911138

>>103909393
Cool Miku

Anonymous 01/15/25(Wed)20:15:33 No.103911139

Anonymous 01/15/25(Wed)20:15:53 No.103911141

>>103907676
>Google is at their peak
I was like 90% sure they were done but they somehow turned it around.

Anonymous 01/15/25(Wed)20:15:59 No.103911145

>>103911109
Fine-ish. It might remember stuff from all over the context but might have issues connecting them.
There's people who also say that it makes the model dumber but I'm not so sure about that.
Basically, try it and see if it works for you since different use cases (lorebooks, vectordb, summarrization, quanted cache, etc) might have different outcomes resulting in conflicting reports.

Anonymous 01/15/25(Wed)20:18:21 No.103911169

403574604-22fc8aa9-c0a5-4127-a66a-6d3df12654b7

>>103911128
This looks pretty promising for reproducing the paper.

Anonymous 01/15/25(Wed)20:21:09 No.103911200

>>103907607
This is going over my head. Does this mean anything for the models we already have?

Anonymous 01/15/25(Wed)20:22:17 No.103911217

>to make sure it isn't copyrighted
yet hunyuan is trained on hollywood movies.that and uncensored hentai. lmao
they even reposted the avatar movie made by it after the drop.

i dont see any way this stuff like with meta and the eu laws will stay. that would mean the end for western models.
call me a chink lover whatever. can you imagine meta being this based? this is fucking tencent. bless those madlads. they repost audi car loras etc. not giving a shit about any copyright. thats how it should be.

Anonymous 01/15/25(Wed)20:24:22 No.103911236

>>103911217
Everyone in lmg basically assumes Chinese model supremacy.

Anonymous 01/15/25(Wed)20:24:40 No.103911237

>>103911145
I like to use group chats with a lot of lorebooks. What's the ideal context then?

Anonymous 01/15/25(Wed)20:24:42 No.103911238

800px-Demis_Hassabis%2C_2024_Nobel_Prize_Laureate_in_Chemistry_%28cropped%29

>>103911141
People underestimated this dude and forgot with the OpenAI hype what Deepmind did with AlphaGo with him at the helm. Deepmind pretty much now controls all of the ML work at Google now having won the internal politics fight over Jeff Dean with Google Research who was competing against him. This was bore out over the shit you saw with Bard and LaMDA and early versions of PaLM and you could see the improvement once Deepmind took the helm and basically took Google from nothing to top of the chats at the moment.

Anonymous 01/15/25(Wed)20:25:06 No.103911246

>>103911236
As long as they dont cuck too hard considering porn is literally illegal in china

Anonymous 01/15/25(Wed)20:25:24 No.103911250

>>103911237
16k is the safe bet seems to be the closest thing to a consensus.

Anonymous 01/15/25(Wed)20:26:07 No.103911259

>>103911217
Hot. Can I run that on my machine?

Anonymous 01/15/25(Wed)20:27:15 No.103911265

>>103911200
Basically 6 months at least before this shows up in anything large.

Anonymous 01/15/25(Wed)20:27:17 No.103911266

>>103911259
Yes, its been out of awhile now

Anonymous 01/15/25(Wed)20:28:46 No.103911288

>>103910224
Yeah I agree. I actually have a Narrator card that does an excellent job because of the {{name}} thing, so the model will tend to take a more distant perspective (especially if you manually adjust the first several) and write very long responses with dialogue from multiple characters. Sometimes that's exactly what I'm looking for, and I tend to use character cards vs user to add perspective during moments. Admittedly this perspective shift can fuck things up over a long context, but considering my longest story reached about a thousand messages and roughly 24-32k in context generally I'd say I can manage to keep a story on track under pretty much any circumstances. But yeah, the lorebook approach has allowed me to actually create a unique world with multiple GCs, a map I drew in GIMP (and eventually wanna figure out img2img and make a proper fantasy map out of, but in all my time with this hobby I've never spent more than an hour or two in imagegen), and an actual timeline with historical figures and empires that are relevant or ancient history depending on whether I'm playing fantasy or cyberpunk in the same world. Being able to drop in info at a specific depth and then remove it when it's not relevant so easily and quickly is the only way I'm able to keep these things in the worlds I'm writing. Now I just need to read more books, because I can't afford to finetune a model but I can at least finetune my brain a touch.

Anonymous 01/15/25(Wed)20:29:23 No.103911297

>>103911259
sure, apparently works even with 6gb of vram according to the nip.
i suspect the usual huge waiting times though. thats the main reason for me why i dont actually use video models locally.

Anonymous 01/15/25(Wed)20:29:39 No.103911300

>>103911238
Yep. If you look at his biography on Wikipedia, Demis Hassabis is legit a 0.1% genius. I would not be surprised if he becomes CEO of Alphabet replacing Sundar.

Anonymous 01/15/25(Wed)20:30:39 No.103911312

>>103911250
How do I notice any negative effects associated with having the context too high? What to expect?

Anonymous 01/15/25(Wed)20:32:03 No.103911325

>>103911297
With fastvideo lora and wavespeed: https://github.com/chengzeyi/Comfy-WaveSpeed?tab=readme-ov-file#dynamic-caching-first-block-cache
Its not that slow really as long as you dont do full 720P videos. 360P takes like 20 seconds with 4090

Anonymous 01/15/25(Wed)20:33:03 No.103911340

>>103911238
Rolling for gemma 3 beating meta

Anonymous 01/15/25(Wed)20:33:10 No.103911341

>>103911325
I have a 4090, I want 1440p uncensored hentai videos. How long for that?

Anonymous 01/15/25(Wed)20:34:11 No.103911352

>>103911341
It was trained up to 720P. It increases quadratically

Anonymous 01/15/25(Wed)20:34:44 No.103911359

>>103911246
that's the most insane part, porn is illegal in China yet this model is completly uncensored and knows how to do porn perfectly

Anonymous 01/15/25(Wed)20:37:50 No.103911396

>>103911341
24gb isn't enough for that resolution, I have a 24gb card and for a 5 sec video my limit is at 576x576

Anonymous 01/15/25(Wed)20:37:54 No.103911397

>>103910256
>Ideally they should be the same, but it will work as long as they run the same driver.
Good to know, thanks.
One more question, is AI software for AMD easier to set up on Linux or Windows?

Anonymous 01/15/25(Wed)20:38:36 No.103911404

>>103911312
I always keep an eye out for contradictions and excessive repetition, not just of words but response patterns.
There's also the model's "intelligence" but that can be harder to gauge since it's not a huge immediate drop off.
You could simply roleplay as normal, and when you notice some oddness cut the context down to 16 or 8k and regen as a sanity test.

Anonymous 01/15/25(Wed)20:38:37 No.103911405

I'm getting back into training after a long break and read a few papers on training params. I remember some anon here saying that GA "made tokens blurry". Turns out it was buggy this whole time: https://huggingface.co/blog/gradient_accumulation

Anonymous 01/15/25(Wed)20:39:56 No.103911417

>>103911396
5 seconds?! Bro I need at least 10 minute!

Anonymous 01/15/25(Wed)20:40:50 No.103911425

>>103911417
kek I'm not sure that'll happen in our lifetime

Anonymous 01/15/25(Wed)20:41:15 No.103911433

>>103911396
I have a 24GB card and can do 720P, are you using sageattention and FP8?

Anonymous 01/15/25(Wed)20:41:55 No.103911442

>>103911425
Can't they just save progress to disk as it goes along instead of keeping the entire thing in memory?

Anonymous 01/15/25(Wed)20:42:23 No.103911449

>>103911431
>>103911431
>>103911431

Anonymous 01/15/25(Wed)20:42:27 No.103911450

>>103911217
>yet hunyuan is trained on hollywood movies
it's true, China doesn't give a fuck about copyright, god bless the chinks
https://xcancel.com/TXhunyuan/status/1863889762396049552#m

Anonymous 01/15/25(Wed)20:43:28 No.103911464

>>103911433
yeah I'm using sageattention and Q8 with the comfy core workflow, like I said I go for 5 sec so 129 frames

Anonymous 01/15/25(Wed)20:44:53 No.103911479

>>103911288
i used to like creating lorebooks out of what the ai would come up with but rag lets me be much lazier. it can mess up details though which is why you still want a lorebook for details like characters, locations. but it gives the ai extra data to exact from when you want the next quest or scene to happen.

Anonymous 01/15/25(Wed)20:45:48 No.103911485

>>103911405
Yeah the lads from unsloth fixed it some time ago, it's incredible how bad the transformers library is.

Anonymous 01/15/25(Wed)20:45:51 No.103911486

>>103911442
Not that anon, but I imagine that the previous frames work like the context for LLMs, so probably not. Unless you want to truncate the context to the last N frames and have the video morph into insanity as it generates.

Anonymous 01/15/25(Wed)20:45:53 No.103911488

>>103911405
>made the tokens feel blurry
Pretty sure that was me about a year or more ago, partly as a joke, but still half serious since it felt like my loras weren't picking up details. I was watching the issue for a while, and was surprised this wasn't caught sooner.

Anonymous 01/15/25(Wed)21:33:38 No.103911971

>>103910958
>filters that removed everything that had too many naughty words
Is this the reason why everything is slopped shivers?

Anonymous 01/15/25(Wed)21:47:36 No.103912089

Why is meta specifically targeted with that lawsuit? Why isn't openai also being hit for using stolen material?

Anonymous 01/15/25(Wed)22:14:03 No.103912317

>>103911479
I need to start using RAG, because that's the main thing I'm missing - sometimes prefills are boring and I want it to pull from an idea or two for an event to happen, but badly enough that it does something unexpected with it. Gotta check if mistral large supports it or not

Anonymous 01/15/25(Wed)22:24:54 No.103912426

>>103907592
denounce circumcision, robot

Anonymous 01/15/25(Wed)22:38:01 No.103912538

>>103912089
They're all getting sued. OpenAI has their own lawsuits going on. The current meta one just blew up because the court got their hands on internal communications where meta employees torrented the libgen library and filtered out copyright notes.