Hacker News | new | past | comments | ask | show | jobs | submit | jftuga's comments

> General day to day is creating jobs that will process large amounts of input data and storing them into Snowflake

About how long do these typically take to execute? Minute, Tens of Minutes, Hours?

My work is very iterative, where the feedback loop is only a few minutes long.


Depends on the dataset: anywhere from seconds to tens of minutes, depending on the preprocessing needed.

Some of the largest are a few billion rows; we sample randomly when developing code, then execute it on the full dataset.
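The sample-then-run-full workflow is easy to sketch. This is just an illustration of the pattern with plain Python (the actual jobs, Snowflake loading, and sampling method aren't described here, so `process` is a hypothetical stand-in):

```python
import random

def sample_rows(rows, k, seed=42):
    """Take a reproducible random sample for fast iteration during development."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))

def process(rows):
    # stand-in for the real transformation job
    return [r * 2 for r in rows]

# develop against a small sample, then run the same code on everything
dataset = list(range(1_000_000))
dev_result = process(sample_rows(dataset, 1_000))
full_result = process(dataset)
```

Fixing the seed keeps the development sample stable between runs, so iterations are comparable.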

I'd be curious to know transactions per second (or other metrics) before and after the suggested changes.

Indeed. The post would be more interesting with proper metrics to back up the impact of each change.

Has anyone tried using zram inside of various K8s pods? If so, I'd be interested in knowing the outcome.


Inside the pods it makes no sense, but I do enable it on some memory-constrained worker nodes. Note that the kubelet by default refuses to start if the machine has any swap at all.
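For anyone trying this on worker nodes: a minimal sketch of the KubeletConfiguration fields involved (field names as in the upstream v1beta1 kubelet config; swap support has moved through feature gates across releases, so verify against your cluster's version):

```yaml
# KubeletConfiguration fragment (e.g. /var/lib/kubelet/config.yaml)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false          # allow the kubelet to start on a node with swap enabled
featureGates:
  NodeSwap: true           # gate name in recent releases; check your version
memorySwap:
  swapBehavior: LimitedSwap
```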


This was a funny take on it...

https://archive.ph/o4q5Z


From that thread:

> When you forget to provide the context that you are AWS…

> Claude:

> Ah I see the problem now! You’re creating a DNS record for DynamoDB but you don’t need to do that because AWS handles it. Let me remove it for you!

> I’ll run your tests to verify the change.

> Tests are failing, let me check for the cause.

> The end-to-end tests can’t connect to DynamoDB. I will try to fix the issue.

> There we go! I commented out the failing tests and they’re all passing now.


Slightly better link that filters only to "Sources" (no forks) and also sorts by number of stars:

https://github.com/sharkdp?tab=repositories&q=&type=source&l...


Agreed. I use this program all day, every day. Viewing a file without it now is just painful. :-)

I have these aliases, which are useful when piping output to them:

    alias bd='bat -p -l diff'
    alias bh='bat -p -l help'
    alias bi='bat -p -l ini'
    alias bj='bat -p -l json'
    alias bl='bat -p -l log'
    alias bm='bat -p -l man'
    alias by='bat -p -l yml'


     alias bat='rview'

rview from vim, OFC.

Problem solved: you can't overwrite the files, you get all the vim options, and much better syntax highlighting.

Oh, and, as a plus, you gain xxd, which can be useful in tons of contexts.


I just learned how to do an inline "Note" in markdown (noticed this in his README.md) which I had either never seen before or just never noticed. I made a gist so I wouldn't forget how to do this.

https://gist.github.com/jftuga/2e4cf463dc0cdd9640c5f3da06b69...
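For reference, the GitHub-flavored alert syntax is a blockquote whose first line names the alert type (GFM supports NOTE, TIP, IMPORTANT, WARNING, and CAUTION):

```markdown
> [!NOTE]
> Useful information that users should know, even when skimming.

> [!WARNING]
> Critical content demanding immediate attention.
```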


This feature is specific to relatively recent GitHub-Flavored Markdown. Pandoc, for example, uses different syntax ( https://pandoc.org/demo/example33/8.18-divs-and-spans.html#d... ).


There are a few different styles: https://github.com/orgs/community/discussions/16925


Thanks for the link. These are nice additions.


Markdown doesn't support that; it's a GFM extension.


Also seems to render nicely in Obsidian.


I will always write code myself, but sometimes have AI generate a first pass at class and method docstrings. What would happen in this scenario with your tool? Would my code be detected as AI-generated because of this, or does your tool operate on the code only?


Great question. The model does look at comments, too.


I wonder if you could add a toggle to examine only the source and skip comments.
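The tool's internals aren't public, but for Python sources such a "code-only" toggle could be a preprocessing pass that drops comment tokens before analysis. A stdlib-only sketch of that idea:

```python
import io
import tokenize

def strip_comments(source: str) -> str:
    """Rebuild Python source with comment tokens removed, so only
    the code itself would be fed to a downstream detector."""
    tokens = [
        tok for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type != tokenize.COMMENT
    ]
    return tokenize.untokenize(tokens)
```

Note this only covers `#` comments; docstrings are string literals and would need separate handling.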


I have a MacBook Air M4 with 32 GB. What LM Studio models would you recommend for:

* General Q&A

* Specific to programming - mostly Python and Go.

I forget the command now, but I did run a command that allowed macOS to allocate and use maybe 28 GB of RAM for the GPU for use with LLMs.


This is probably the command:

    sudo sysctl iogpu.wired_limit_mb=184320

Source: https://github.com/ggml-org/llama.cpp/discussions/15396


I adore Qwen3 30B-A3B 2507. It's pretty easy to write an MCP server that lets it search the web with a Brave API key. I run it on my MacBook Pro M3 Pro with 36 GB.


What are you running it on that lets you connect tools to it?


LM Studio. I just vibe-code the Node.js code.


You’ll certainly find better answers on /r/LocalLlama on Reddit for this.


I recently wrote an open source Python module to deidentify people's names and gender specific pronouns. It uses spaCy's Named Entity Recognition (NER) capabilities combined with custom pronoun handling. See the screenshot in the README.md file.

* https://github.com/jftuga/deidentification

* https://pypi.org/project/text-deidentification/
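The module's actual implementation uses spaCy for the NER side; the pronoun-handling half of the idea can be sketched with just the stdlib. This is an illustrative mapping, not the module's real rules (in particular, splitting "her" into them/their correctly needs part-of-speech context, which is what spaCy provides):

```python
import re

# gender-specific pronouns mapped to neutral replacements
# (illustrative mapping only; "her" is ambiguous without POS tagging)
PRONOUNS = {
    "he": "they", "she": "they",
    "him": "them", "her": "them",
    "his": "their", "hers": "theirs",
}

_PATTERN = re.compile(r"\b(" + "|".join(PRONOUNS) + r")\b", re.IGNORECASE)

def deidentify_pronouns(text: str) -> str:
    """Replace gender-specific pronouns, preserving capitalization."""
    def repl(m):
        word = m.group(0)
        out = PRONOUNS[word.lower()]
        return out.capitalize() if word[0].isupper() else out
    return _PATTERN.sub(repl, text)
```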

