Hacker News: ccheney's comments

We need to start ingesting raw scientific data through these models and see what it comes up with. What could these models identify by parsing through raw JWST or Hubble data? Or training against every published scientific paper? Is anyone doing this sort of thing already?


Meta's Galactica was an attempt to train an LLM predominantly on scientific papers, articles and so on. It failed pretty spectacularly but Galactica 2, if that's ever a thing, might rectify that.


GP likely means training transformers on raw data (similar to protein folding transformers) to find patterns that humans cannot (due to lack of context, bias, or whatever).

The problem with that assumption, though, is that transformers are good at identifying and replicating patterns given a set of rules (i.e. how proteins fold and misfold depending on the environment).

Hubble data isn’t so much “we know the rules but not their interactions” as much as “we don’t really know the full set of rules,” so that particular example probably wouldn’t be that fruitful.

In general, biology (where we understand the basic rules but not the complex ways they are combined) is the most fertile ground for transformer driven research.


Seems like a path to fingerprinting users for tracking purposes and a potential vector for data leaks


The one thing this proposal does right is trying to avoid fingerprinting.


Geohot is working w/ AMD and essentially pleading with them to fix their driver stacks - check out his interview on Latent Space


I'm pretty sure geohot gave up on his AMD plan pretty quickly.

https://news.ycombinator.com/item?id=36189705


He flamed out, then is back after Lisa Su called him (lmao)

https://geohot.github.io/blog/jekyll/update/2023/05/24/the-t...

https://www.youtube.com/watch?v=Mr0rWJhv9jU

https://github.com/RadeonOpenCompute/ROCm/issues/2198#issuec...

https://geohot.github.io/blog/jekyll/update/2023/06/07/a-div...

On a personal level that youtube doesn't make him come off looking that good... like people are trying to get patches to him and generally soothe him/damage control and he's just being a bit of a manchild. And it sounds like that's the general course of events around a lot of his "efforts".

On the other hand he's not wrong either, having this private build inside AMD and not even validating official, supported configurations for the officially supported non-private builds they show to the world isn't a good look, and that's just the very start of the problems around ROCm. AMD's OpenCL runtime was never stable or good either and every experience I've heard with it was "we spent so much time fighting AMD-specific runtime bugs and specs jank that what we ended up with was essentially vendor-proprietary anyway".

On the other other hand, it sounds like AMD knows this is a mess and has some big stability/maturity improvements in the pipeline. It seems clear from some of the smoke coming out of the building that they're cooking on more general ROCm support for RDNA cards, and generally working to patch the maturity and stability issues he's talking about. I hate the "wait for drivers/new software release bro it's gonna fix everything" that surrounds AMD products but in this case I'm at least hopeful they seem to understand the problem, even if it's completely absurdly late.

Some of what he was viewing as "the process happening in secret" was likely people doing rush patches on the latest build to accommodate him, and he comes off as berating them over it. Again, like, that stream just comes off as "mercurial manchild" not coding genius. And everyone knew the driver situation is bad, that's why there's notionally alpha for him to realize here in the first place. He's bumping into moneymakers, and getting mad about it.


It's kinda strange seeing some people link advancements in transformer models with stuff like cryptos and NFTs. You've gotta ask, where are these thoughts coming from? Plus, the ongoing use of the 'stochastic parrot' argument is starting to feel a bit repetitive in these discussions.


The irony is that 99% of them haven't actually read the paper the term originated from.

One of its only recurring warnings was the danger of underestimating what a "stochastic parrot" can do in practical terms.


Seems shortsighted and overly limiting to me. Perhaps in this specific case it makes sense?

What's the difference?

A) Human creates artwork in the style of [insert artist here]

B) Computer creates artwork in the style of [insert artist here]

Both "trained" against existing copyrighted works except one is human. Is this to "save jobs"?


Any mirror?

Edit:

The Bill https://legislation.nysenate.gov/pdf/bills/2023/S6686

gpt-4 summary:

The bill, numbered 6686, was introduced by Sen. Hoylman-Sigal and is related to social media open application programming.

The bill proposes amendments to the general business law, specifically introducing a new article 42 titled "Social Media Open Application Programming Interface Access". The article includes sections defining terms, requirements for open API access with social media for third-party development tools, API access reports, violations and remedies, and application of the law.

Key points from the bill include:

Social media platforms are required to implement and maintain a standards-based API that permits third-party applications to retrieve data at no cost, for the benefit of the user or the user's authorized representative.

The bill outlines the type of data that must be provided to users or their authorized representatives through the API. This includes personal data controlled by the social media platform, platform-produced or recommended data available to the user, and data that is generally available to all account holders about the user's friends or followers. Social media platforms must conduct routine testing, ongoing monitoring, and make all necessary updates to ensure the API functions properly.

The bill also outlines circumstances under which a social media platform may deny or discontinue a user or authorized representative's application's access to the API.

Social media companies are required to submit semiannual API access reports to the attorney general. These reports should include information on API utilization and access decisions.


Someone must've copied their node_modules folder into Google Drive by mistake...


  "Thee proplem weth a systhemateck, fhoneteck chaigne en 
  Englush speleeng es thet et raeses thee questeon uv wut 
  aksent to faivor. Fer an Americon en Oheo, thees 
  transkriptshon probubly maeks sum sens, wunce yu get paest 
  thee skwaw. I daut an Engleshman or eaven a Nue Yorcer wud 
  fynd et ueseful."
please rewrite it but this time with a Southern drawl

  "Th' proplem weth a systh'mateck, fhoneteck chaigne en 
  Aynglsh speln iz thayt et raeses th' questyun uv wut aksent 
  ta fayver. Fer a Amur'can down yonder in Ohayo, thees 
  transkripshun prahbly maeks sum sens, wunce y'all git past 
  th' skwaw. Ah daut an Engleshman or ev'n a Nue Yorcer wud 
  fynd et much ueseful."

  - GPT-4 / https://sharegpt.com/c/Q5DW1wg


EDIT: this specific issue is resolved

Failing for us in GitHub Actions

For SEO purposes:

  npm ERR! code ERR_TLS_CERT_ALTNAME_INVALID
  npm ERR! errno ERR_TLS_CERT_ALTNAME_INVALID
  npm ERR! request to https://pkg-npm.githubusercontent.com/npmregistryv2prod/blobs/** failed, reason: Hostname/IP does not match certificate's altnames: Host: pkg-npm.githubusercontent.com. is not in the cert's altnames: DNS:*.githubassets.com, DNS:githubassets.com
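
For anyone wondering why that error fires: the cert being served only lists githubassets.com names, so a request to pkg-npm.githubusercontent.com can't match any of them. Here's a rough sketch of the check, assuming the usual simplified rule that a wildcard is only allowed as the whole leftmost label and covers exactly one label. The function names are made up for illustration and this is not the actual code npm or Node runs.

```python
def matches_altname(hostname: str, pattern: str) -> bool:
    """Simplified altname check (illustrative only, not Node's real logic)."""
    # Compare case-insensitively and ignore a trailing root dot.
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    # A wildcard stands in for exactly one label, so label counts must agree.
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        # Only the leftmost label may differ.
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def host_in_altnames(hostname: str, altnames: list[str]) -> bool:
    """True if any altname in the certificate covers the hostname."""
    return any(matches_altname(hostname, name) for name in altnames)


# The DNS altnames reported in the error above.
ALTNAMES = ["*.githubassets.com", "githubassets.com"]

print(host_in_altnames("pkg-npm.githubusercontent.com", ALTNAMES))
```

So nothing was wrong on the client side. The registry host was presenting a certificate for a different domain.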


I'm curious to see what comes of V2G tech and other home battery systems. I know that Panasonic[1] & Enphase[2] have their own systems rolling out. Adding these systems to new home builds makes a lot of sense to me in the interest of future-proofing.

Here's an interesting example where a new housing development in Las Vegas built all the houses with battery storage as well as solar generation for the entire neighborhood[3].

[1] https://na.panasonic.com/us/energy-solutions/battery-storage...

[2] https://enphase.com/store/storage/iq-battery-10

[3] https://electrek.co/2022/12/08/tesla-neighborhood-launches-p...

