Recent Activity

dippatel1994
posted an update 4 days ago
To make reviewing LLM architectures and training methods faster, I created a deck of 180 visual flashcards. It started as a personal hobby, but gradually became a cheat code for reviewing LLM concepts before technical interviews. People love it!

Swipe through these samples, and if you want to grab the full set or follow the project, the repo is here: https://github.com/llmsresearch/llm-flashcards.
bshepp
posted an update 23 days ago
A dead 2013 Butterfly Labs "Jalapeno" SHA-256 mining ASIC sat in a drawer for a decade. It became the excuse for a small, careful question: how much structure can a tiny, cheap model learn in SHA-256, and how would I know if I were fooling myself? (The ML runs on CPU and a HF job, not the ASIC; the dead miner is just the origin story.)

Three findings, written up honestly:

1. A sharp round-4 cliff. Round-reduced SHA-256 is ~100% distinguishable through 3 rounds, then collapses to chance at round 4 and stays there out to the full 64. Reproduced across 5 seeds.

2. A controls-gated bounded null on full SHA-256: no learnable structure above a ~0.22% resolution floor at n=4,000,000. That is a bounded null at this budget, not a claim that SHA-256 is random.

3. A "signal" in the iterated-hash dynamics that a permuted-label control unmasked as a label-prior artifact. The instrument caught its own false positive. That was the point of building the controls.
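
For readers unfamiliar with the term in finding 1, here is a minimal sketch of what "round-reduced" can mean. It assumes the reduction runs only the first r of the 64 compression rounds and keeps the final feed-forward addition; that reading, and the names `sha256_reduced` and `compress`, are illustrative assumptions and not the repo's code. The constants are derived from prime roots, as the SHA-256 specification defines them, rather than typed in by hand.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def _primes(n):
    """Return the first n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def _icbrt(x):
    """Floor of the cube root of a non-negative integer, by binary search."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants: first 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]
# Initial state: first 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in _primes(8)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block, rounds=64):
    """SHA-256 compression of one 64-byte block, stopped after `rounds` rounds (0..64)."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for t in range(rounds):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    # Feed-forward addition (kept here by assumption, even for reduced rounds).
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]

def sha256_reduced(message, rounds=64):
    """Hash `message` with standard padding but only `rounds` rounds per block."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    state = list(H0)
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64], rounds)
    return struct.pack(">8I", *state)

# Sanity check: with all 64 rounds this must agree with the standard library.
assert sha256_reduced(b"abc") == hashlib.sha256(b"abc").digest()
print(sha256_reduced(b"abc", rounds=3).hex())
```

With `rounds=64` the function matches `hashlib.sha256`, which is a cheap way to confirm that any reduced-round variant starts from a correct implementation.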

Negative results, stated with their resolution. The dataset carries the controls on every row.
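
A toy illustration of the kind of permuted-label control described in finding 3. This is a sketch under stated assumptions, not the project's pipeline: the data is synthetic, the 60/40 label imbalance is made up, and `fit_majority` is a hypothetical input-blind "model" chosen so that any above-chance accuracy can only come from the label prior.

```python
import random

def accuracy(predict, xs, ys):
    """Fraction of examples where the prediction equals the label."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(ys)

def fit_majority(train_ys):
    """Hypothetical input-blind model: always predict the most common training label."""
    majority = max(set(train_ys), key=train_ys.count)
    return lambda x: majority

rng = random.Random(0)
n = 10_000
half = n // 2

# Synthetic inputs, with labels drawn independently of the inputs (assumed 60/40 prior).
xs = [rng.getrandbits(32) for _ in range(n)]
ys = [1 if rng.random() < 0.6 else 0 for _ in range(n)]

# "Real" run: train on the first half, score on the second half.
model = fit_majority(ys[:half])
real_acc = accuracy(model, xs[half:], ys[half:])

# Control run: shuffle the labels so any input-label relationship is destroyed,
# then repeat exactly the same train/score procedure.
permuted = ys[:]
rng.shuffle(permuted)
control_model = fit_majority(permuted[:half])
control_acc = accuracy(control_model, xs[half:], permuted[half:])

print(f"real accuracy:    {real_acc:.3f}")
print(f"control accuracy: {control_acc:.3f}")
```

Both runs score well above 50%, and about equally. Because the control cannot have learned anything from the inputs, matching scores indicate the apparent "signal" comes from the label prior alone.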

Dataset: bshepp/round-reduced-sha256-learnability
Code (MIT) + full writeup: https://github.com/bshepp/bfl-asic
satpalsr
posted an update 26 days ago
We're open-sourcing our infra, along with a dataset of 10M+ frames!

We're releasing Stera, open-source infrastructure that turns an off-the-shelf device in your pocket into a high-fidelity multimodal data pipeline. It's built around four layers: Capture → Process → Evaluate → Export.

Stera Capture removes the need for bespoke/gated hardware and runs on an off-the-shelf iPhone. It fuses synchronized RGB, IMU, LiDAR-guided depth, and 6-DoF pose from ARKit out of the box and exports them to a raw MCAP file.

Dataset: fpvlabs/stera-10m
Launch Details: https://x.com/fpv_labs/status/2055262652033908832
merve
updated a Space 28 days ago
ZennyKenny
in blog-explorers/README about 1 month ago

🚩 Report: Spam

1
#19 opened about 1 month ago by
ccocks-deca
ccocks-deca
in blog-explorers/README about 1 month ago
apehex
in blog-explorers/README about 1 month ago
RiverRider
in blog-explorers/README about 1 month ago
Yann-CV
posted an update about 1 month ago
🚀 Introducing Goldener: The Python Data Orchestrator for more efficient ML

Machine Learning workflows often rely on randomness: selecting/splitting data for training, batching it variably, and monitoring real-world performance.

Nowadays, foundation models give access to the semantics of data. Goldener leverages these semantics to make the entire ML lifecycle more efficient!

🔗 Check it out: https://github.com/goldener-data/goldener
🔨 Give it a try: pip install goldener
intrect
posted an update about 2 months ago
I’m excited to share a new paper I recently posted on arXiv: ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics.

This work asks a simple question: can AI-generated music be detected not only by style, but by the physical artifacts left behind during generation?

ArtifactNet approaches the problem from that angle. Instead of only learning what AI music sounds like on a fixed benchmark, it analyzes forensic residual patterns linked to neural audio codec bottlenecks such as residual vector quantization (RVQ).

In our experiments, ArtifactNet achieved F1 = 0.9829 on a zero-overlap multi-generator benchmark spanning 22 AI generators and 6 real-music sources, while using only 4.0M parameters. Under the same evaluation setting, larger prior models showed substantial degradation on out-of-distribution generators and real-music false positives.

I also introduced ArtifactBench, a broader evaluation benchmark designed to stress-test detector robustness across unseen generators, diverse real sources, hard negatives, and codec conditions.

This was a deeply rewarding project at the intersection of audio forensics, MIR, and generative model evaluation.

https://arxiv.org/abs/2604.16254