Category: living

Use It or Lose It Notes

Because I mostly don’t use it, and then end up losing it. This is my living blog of quick, forgotten patterns. Not profound, just practical.

Table of Contents
- Spark
- Java

Spark

1. Create a SparkSession (Boilerplate I always forget)

    SparkSession spark = SparkSession.builder()
        .appName("UILI")
        .master("local[*]")
        .getOrCreate();

2. Create a Dataset from Strings (not from files)

    Dataset<String> ds = spark.createDataset(
        Arrays.asList("Abc", "xyz"),
        Encoders.STRING()
    );

3. SparkConf and RDD creation options

    SparkConf conf = new SparkConf().

Read more...

The Blog of Blogs

List of Links
- Slack Architecture
- G1GC
- Prompt Engineering White Paper

Growth Map - Non Checklist

🔭 Concepts to Explore Later

- Hybrid Search (Sparse + Dense Retrieval): Combine traditional keyword search (like TF-IDF/BM25) with embeddings for better relevance, especially in enterprise search.
- Vector Databases (FAISS, Pinecone, Weaviate, Milvus): Each has trade-offs in latency, scalability, and integrations. Worth exploring for hands-on projects.
- Document Chunking Strategies: How to split large docs into semantically meaningful chunks before embedding, which affects RAG accuracy.

Read more...
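For the Document Chunking Strategies item, here is a rough baseline sketch to start from. It is not semantic chunking: it just splits on whitespace into fixed-size word windows with some overlap, which is the simplest thing to compare smarter strategies against. The class name `Chunker` and the parameters `chunkSize` and `overlap` are made up for this note, not from any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: fixed-size word-window chunking with overlap.
// A baseline only; it ignores sentence and paragraph boundaries.
public class Chunker {

    // Returns chunks of at most chunkSize words; each chunk after the
    // first repeats the last `overlap` words of the previous chunk.
    public static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException(
                "need chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks; // nothing to split
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

Calling `Chunker.chunk("one two three four five six seven", 4, 1)` gives two chunks, "one two three four" and "four five six seven". The overlap is there so that a sentence cut at a window edge still appears whole in at least one chunk more often; how much overlap helps RAG accuracy is something to measure, not assume.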