Data Structure for Knowledge = Language Models + Structured Data - DRT S2E13

Video Link: https://www.youtube.com/watch?v=ICZ0ShZ6ud0



Duration: 48:54


00:16 Get to know Dr. Joel Chan
04:43 Joel's PhD in Cognitive Science and transition to Human-Computer Interaction
07:54 Exploring a patent search engine
09:52 The main problem statement: how to organize artifacts to accelerate creativity
12:05 Practitioners often look for artifacts from tasks similar to their own use case
12:48 Transition from knowledge repository model to expertise sharing and collaborative work
17:14 Sharing artifacts with a coworker without enough contextual (e.g., tacit) knowledge is not helpful
18:10 Area of focus: data models for collaborative knowledge work (human expertise + externalized artifacts + unstructured data)
24:27 Mining data on how people work is one way to create systems for knowledge sharing
25:05 Slack as an example of quick recontextualization of knowledge: human interactions enriched with resources and context
29:40 Creating systems that force people into certain behavior templates doesn't work (unless the tasks are very low variance)
32:31 Designing the perfect externalized knowledge artifact and process really depends on the context of how people work and what they find useful
33:08 How would you get around the rigidity of "expert systems"? You only need enough structure to provide value, even if the system is a bit "scruffy"
34:36 System requirements for an ideal collaborative knowledge work system: reusability and recontextualization, low-effort maintenance, just-in-time availability, and factuality
38:32 Factuality and large language models for products in highly technical areas
44:54 Wrap up and verdicts

KEY TAKEAWAYS:
1. After research in Cognitive Science in grad school, Joel worked on creating a patent search engine that utilized text analysis and text mining systems. The goal was to add structure to unstructured full text and find interesting analogies in the patent databases.
2. His focus then shifted to broader thinking about data structures, with the importance of understanding not just the problem and solution, but also the situations, constraints, and phenomena involved.
3. Adding structure and organization to collections of people, artifacts, resources, and patterns is an important problem, because that structure is what enables efficient collaborative work.
4. People have experimented with both automated text mining and crowdsourced human annotation to create "expert systems", with only limited success.
5. Externalizing knowledge completely and objectively is unlikely, so the best use of knowledge artifacts is to facilitate a negotiation between collaborators' tacit knowledge and the externalized artifacts, leading to the curation of knowledge.
6. Slack is a tool developers use every day to communicate with each other. It is a great example of multiplayer HCI in which "expertise sharing" (asking questions) and "knowledge sharing" (via artifacts) are deeply integrated.
7. Recontextualization and reuse: The ability to easily recontextualize information that was decontextualized when it was documented is a key factor in the design of a knowledge system. It is important to be able to pick up the information and adapt it to a new use case.
8. Maintenance and availability: It is important to have a system that can organize itself over time without much effort from its users or operators. It should also be available for just-in-time insight.
9. Factuality: Being factually grounded is an important factor when designing a system that helps users navigate technical knowledge. It is essential to be able to judge the extent to which a claim is true and to have a system that helps verify it through provenance and evidence.
10. The addition of large language models to the system design is an interesting development, but it comes at the cost of hallucination (at least in the short term and in the absence of integration with reliable information sources).
11. There are design questions about how to translate latent structures into machine-usable structures that can be fed into information retrieval algorithms, which is a major focus in the field. Ultimately, however, what matters is which workflow feels intuitive to knowledge workers and what they find useful.
12. The most reliable way to design such a system is "integrated crowdsourcing", where the structure of knowledge work is "mined from observing people" who are already motivated to do the work (covering its social, process, and artifact-use aspects).
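
As a rough illustration of the recontextualization and factuality takeaways, here is a minimal Python sketch of one possible data model in which a claim carries both its capture-time context and its evidence trail. Everything in it is an assumption made for illustration and is not from the episode: the names `Evidence`, `Claim`, `support_level`, and `recontextualize` are hypothetical, and the support labels are a crude stand-in, not a real measure of factuality.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Evidence:
    """Provenance for a claim: where it comes from (hypothetical model)."""
    source: str     # e.g. a document title or chat thread
    excerpt: str    # the passage that bears on the claim
    supports: bool  # True if the excerpt backs the claim, False if it disputes it


@dataclass(frozen=True)
class Claim:
    """An externalized piece of knowledge plus the context it was captured in."""
    text: str
    context: str                        # situation or constraints at capture time
    evidence: Tuple[Evidence, ...] = ()


def support_level(claim: Claim) -> str:
    """Crude, illustrative label for how well grounded a claim is."""
    if not claim.evidence:
        return "unverified"
    backing = sum(1 for e in claim.evidence if e.supports)
    disputing = len(claim.evidence) - backing
    if disputing == 0:
        return "supported"
    if backing == 0:
        return "disputed"
    return "mixed"


def recontextualize(claim: Claim, new_context: str) -> Claim:
    """Copy a claim into a new use case while keeping its evidence trail."""
    return replace(claim, context=new_context)


# Example data paraphrased from the 29:40 chapter title above.
claim = Claim(
    text="Rigid behavior templates only work for low-variance tasks",
    context="designing a team wiki",
    evidence=(Evidence("DRT S2E13, 29:40", "forcing templates doesn't work", True),),
)
reused = recontextualize(claim, "onboarding checklist design")
print(support_level(reused), "|", reused.context)
```

Keeping the evidence attached when the context changes is the point of the sketch: the reused claim can still be checked against its sources after it has been adapted to a new use case.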




2023-01-31 Data Structure for Knowledge = Language Models + Structured Data - DRT S2E13



Tags:
deep learning
machine learning