Hacker Newsnew | past | comments | ask | show | jobs | submit | mariarmestre's commentslogin

Hmmm... this is not eligible for their zero data retention policy anymore. Not sure how this will go down.


They added a clause that the schemas themselves aren't covered by ZDR. Their policy for the prompts appears unchanged.

https://platform.openai.com/docs/models/default-usage-polici...


Are you going to keep maintaining the package?


We intend to.


I just tried it on an AI newsletter: https://a.tldrnewsletter.com/web-version?ep=1&lc=f08d3180-da... and it turns out I'm not keeping up with AI news as well as I thought.



Our software changes submitted links to canonical URLs when it finds them, and that page has the canonical URL https://community.intel.com/t5/Blogs/Tech-Innovation/Cloud/b....

I've fixed it above now.


Thank you! Our goal is to make production-ready code, so we believe that good documentation and stability of the code are paramount. I'll pass the feedback along to the rest of the team :)


This represents a major rewrite of the package with more powerful features than ever.

We have introduced the new concept of components. Components are composable and customisable, and can be connected into pipelines. Pipelines are dynamic execution graphs that support everything from simple linear chains to more complex flows containing loops. This means you can get started easily with a few lines of code, but still have room to extend and customise the pipeline's logic.
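To make the idea concrete, here is a minimal sketch of the component/pipeline pattern described above. All names here (`Component`, `Pipeline`, `Splitter`, `Counter`) are illustrative assumptions for this comment, not the package's actual API.

```python
# Hypothetical sketch of composable components wired into a pipeline.
# Names and signatures are illustrative, not the real package API.

class Component:
    """A reusable processing step: consumes an input, produces an output."""
    def run(self, data):
        raise NotImplementedError

class Splitter(Component):
    """Example component: split text into tokens."""
    def run(self, data):
        return data.split()

class Counter(Component):
    """Example component: count the items it receives."""
    def run(self, data):
        return len(data)

class Pipeline:
    """A linear chain of components. The real concept generalises this
    to a dynamic execution graph that can also contain loops."""
    def __init__(self):
        self.steps = []

    def add(self, component):
        self.steps.append(component)
        return self  # allow chaining

    def run(self, data):
        for step in self.steps:
            data = step.run(data)
        return data

pipe = Pipeline().add(Splitter()).add(Counter())
print(pipe.run("retrieval augmented generation"))  # 3
```

The point of the pattern is that each step is swappable in isolation, while the pipeline object owns the execution order.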

This restructuring of the package paves the way to building truly extensible and composable AI systems ready for production.

The team will be around for questions!


Thank you for this! Congrats on the beta release. I guess this is not totally battle-tested yet then?


No, but it's a real product.


I think a fundamental issue with search, and the reason why many companies do not invest in tuning a good search experience, is that the main metric is usually minimising embarrassing/irrelevant results rather than surfacing the best possible set of results. How can you even know what the best answer to your query is? Systematic evaluation is very hard.


If you control the browser your results are displayed in, you can monitor clicks and time spent on each document to generate a pretty good signal. If someone opens a document and looks at it for fifteen minutes, you should be fairly convinced it was useful.
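A simple version of that click-plus-dwell-time heuristic can be sketched as follows. The field names and the 30-second/15-minute thresholds are assumptions for illustration, not from any particular product:

```python
# Illustrative sketch: turning click + dwell-time logs into a relevance signal.
# Field names and thresholds are assumed for this example.

def relevance_signal(events, min_dwell_seconds=30, max_dwell_seconds=900):
    """Score each (query, doc) pair between 0 and 1.

    A document opened but abandoned quickly scores 0; dwell time
    scales the score up to 1.0 at max_dwell_seconds (15 minutes),
    which we treat as strong evidence the result was useful.
    """
    signals = {}
    for e in events:
        key = (e["query"], e["doc"])
        if e["dwell_seconds"] >= min_dwell_seconds:
            score = min(e["dwell_seconds"] / max_dwell_seconds, 1.0)
            signals[key] = max(signals.get(key, 0.0), score)
        else:
            signals.setdefault(key, 0.0)  # clicked, but bounced
    return signals

events = [
    {"query": "tax form", "doc": "a.pdf", "dwell_seconds": 900},  # 15 min: strong
    {"query": "tax form", "doc": "b.pdf", "dwell_seconds": 5},    # bounce: weak
]
print(relevance_signal(events))
# {('tax form', 'a.pdf'): 1.0, ('tax form', 'b.pdf'): 0.0}
```

Aggregated over many users, scores like these can feed directly into ranking evaluation or training data, which is exactly the signal hand-tuned relevance judgements struggle to provide at scale.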


The only issue here is what I mentioned in the comment above: how easy is it to read and parse the content of said website, and is it legal to read the content programmatically? Do you have any website(s) in mind?


Thanks so much for your comment! You're right that this annotation tool can be used on any form of free-form document found online. I tackled Wikipedia first because it was an obvious first choice and they have an API to read the HTML. This could be opened up to other sources of data, but I also do not want this to become a scraping tool, so we would need to weigh the costs/benefits of adding new data sources. The additional cost of adding a new source is mostly about how difficult it is to read and parse the content. In the future, I could integrate with some paid sources (e.g. news publications), where people would pay for the content they scrape and label.

I have a pitch deck and I'm looking for all the things you mentioned :-). I can send the pitch deck to anyone interested.

