redis migration

Signed-off-by: ale <ale@manalejandro.com>
This commit is contained in:
ale
2025-12-15 17:43:08 +01:00
parent da89037125
commit 4d9545d0ec
18 files changed, 1218 insertions and 1065 deletions

README.md

@@ -1,9 +1,9 @@
# Hasher 🔐
-A modern, high-performance hash search and generation tool powered by Elasticsearch and Next.js. Search for hash values to find their plaintext origins or generate hashes from any text input.
+A modern, high-performance hash search and generation tool powered by Redis and Next.js. Search for hash values to find their plaintext origins or generate hashes from any text input.
-![Hasher Banner](https://img.shields.io/badge/Next.js-16.0-black?style=for-the-badge&logo=next.js)
-![Elasticsearch](https://img.shields.io/badge/Elasticsearch-8.x-005571?style=for-the-badge&logo=elasticsearch)
+![Hasher Banner](https://img.shields.io/badge/Next.js-15.4-black?style=for-the-badge&logo=next.js)
+![Redis](https://img.shields.io/badge/Redis-7.x-DC382D?style=for-the-badge&logo=redis)
![TypeScript](https://img.shields.io/badge/TypeScript-5.x-3178C6?style=for-the-badge&logo=typescript)
## ✨ Features
@@ -11,7 +11,7 @@ A modern, high-performance hash search and generation tool powered by Elasticsea
- 🔍 **Hash Lookup**: Search for MD5, SHA1, SHA256, and SHA512 hashes
- 🔑 **Hash Generation**: Generate multiple hash types from plaintext
- 💾 **Auto-Indexing**: Automatically stores searched plaintext and hashes
-- 📊 **Elasticsearch Backend**: Scalable storage with 10 shards for performance
+- 📊 **Redis Backend**: Fast in-memory storage with persistence
- 🚀 **Bulk Indexing**: Import wordlists via command-line script
- 🎨 **Modern UI**: Beautiful, responsive interface with real-time feedback
- 📋 **Copy to Clipboard**: One-click copying of any hash value
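The hash generation described in the features above can be pictured with Node's built-in `crypto` module. This is a minimal sketch, not the project's actual `lib/hash.ts`:

```typescript
import { createHash } from "node:crypto";

// Generate all four digests the app supports from a single plaintext.
function generateHashes(plaintext: string): Record<string, string> {
  const algos = ["md5", "sha1", "sha256", "sha512"];
  return Object.fromEntries(
    algos.map((a) => [a, createHash(a).update(plaintext).digest("hex")])
  );
}

console.log(generateHashes("password").md5);
// → 5f4dcc3b5aa765d61d8327deb882cf99
```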
@@ -32,8 +32,8 @@ A modern, high-performance hash search and generation tool powered by Elasticsea
┌─────────────┐
-│Elasticsearch│ ← Distributed storage
-│  10 Shards  │   (localhost:9200)
+│    Redis    │ ← In-memory storage
+│             │   with persistence
└─────────────┘
```
@@ -42,7 +42,7 @@ A modern, high-performance hash search and generation tool powered by Elasticsea
### Prerequisites
- Node.js 18.x or higher
-- Elasticsearch 8.x running on `localhost:9200`
+- Redis 7.x or higher
- npm or yarn
### Installation
@@ -58,20 +58,28 @@ A modern, high-performance hash search and generation tool powered by Elasticsea
npm install
```
-3. **Configure Elasticsearch** (optional)
+3. **Configure Redis** (optional)
-By default, the app connects to `http://localhost:9200`. To change this:
+By default, the app connects to `localhost:6379`. To change this:
```bash
-export ELASTICSEARCH_NODE=http://your-elasticsearch-host:9200
+export REDIS_HOST=localhost
+export REDIS_PORT=6379
+export REDIS_PASSWORD=your_password  # Optional
+export REDIS_DB=0                    # Optional, defaults to 0
```
-4. **Run the development server**
+4. **Start Redis**
+```bash
+redis-server
+```
+5. **Run the development server**
```bash
npm run dev
```
-5. **Open your browser**
+6. **Open your browser**
Navigate to [http://localhost:3000](http://localhost:3000)
@@ -100,6 +108,9 @@ npm run index-file wordlist.txt
# With custom batch size
npm run index-file wordlist.txt -- --batch-size 500
+# Resume from last position
+npm run index-file wordlist.txt -- --resume
# Show help
npm run index-file -- --help
```
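The `--batch-size` behaviour of the indexer can be sketched with a small batching helper. This is illustrative only; the real `scripts/index-file.ts` is not shown in this diff:

```typescript
// Split a wordlist into fixed-size batches, as --batch-size implies.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const words = ["password", "123456", "qwerty", "letmein", "dragon"];
console.log(toBatches(words, 2).length); // → 3
```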
@@ -117,7 +128,23 @@ qwerty
- ✅ Progress indicator with percentage
- ✅ Error handling and reporting
- ✅ Performance metrics (docs/sec)
-- ✅ Automatic index refresh
+- ✅ State persistence for resume capability
+- ✅ Duplicate detection
+
+### Remove Duplicates Script
+
+Find and remove duplicate hash entries:
+
+```bash
+# Dry run (preview only)
+npm run remove-duplicates -- --dry-run --field md5
+
+# Execute removal
+npm run remove-duplicates -- --execute --field sha256
+
+# With custom batch size
+npm run remove-duplicates -- --execute --field md5 --batch-size 100
+```
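The core of duplicate detection over a chosen field might look like the sketch below. The entry shape and function name are hypothetical, assumed for illustration:

```typescript
type HashEntry = { plaintext: string; md5: string };

// Collect entries whose value for the chosen field was already seen.
function findDuplicates(entries: HashEntry[], field: keyof HashEntry): HashEntry[] {
  const seen = new Set<string>();
  const dupes: HashEntry[] = [];
  for (const e of entries) {
    if (seen.has(e[field])) dupes.push(e);
    else seen.add(e[field]);
  }
  return dupes;
}
```

A dry run would report `dupes`; an `--execute` pass would delete them.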
## 🔌 API Reference
@@ -158,6 +185,7 @@ Search for a hash or generate hashes from plaintext.
"found": true,
"isPlaintext": true,
"plaintext": "password",
"wasGenerated": false,
"hashes": {
"md5": "5f4dcc3b5aa765d61d8327deb882cf99",
"sha1": "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
@@ -171,52 +199,60 @@ Search for a hash or generate hashes from plaintext.
**GET** `/api/health`
-Check Elasticsearch connection and index status.
+Check Redis connection and database statistics.
**Response**:
```json
{
  "status": "ok",
-  "elasticsearch": {
-    "cluster": "elasticsearch",
-    "status": "green"
+  "redis": {
+    "version": "7.2.4",
+    "connected": true,
+    "memoryUsed": "1.5M",
+    "uptime": 3600
  },
-  "index": {
-    "exists": true,
-    "name": "hasher",
-    "stats": {
-      "documentCount": 1542,
-      "indexSize": 524288
-    }
+  "database": {
+    "totalKeys": 1542,
+    "documentCount": 386,
+    "totalSize": 524288
  }
}
```
-## 🗄️ Elasticsearch Index
+## 🗄️ Redis Data Structure
-### Index Configuration
+### Key Structures
-- **Name**: `hasher`
-- **Shards**: 10 (for horizontal scaling)
-- **Replicas**: 1 (for redundancy)
+The application uses the following Redis key patterns:
-### Mapping Schema
+1. **Hash Documents**: `hash:plaintext:{plaintext}`
+```json
+{
+  "plaintext": "password",
+  "md5": "5f4dcc3b5aa765d61d8327deb882cf99",
+  "sha1": "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
+  "sha256": "5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8",
+  "sha512": "b109f3bbbc244eb82441917ed06d618b9008dd09b3befd1b5e07394c706a8bb980b1d7785e5976ec049b46df5f1326af5a2ea6d103fd07c95385ffab0cacbc86",
+  "created_at": "2024-01-01T00:00:00.000Z"
+}
+```
-```json
-{
-  "plaintext": {
-    "type": "text",
-    "analyzer": "lowercase_analyzer",
-    "fields": {
-      "keyword": { "type": "keyword" }
-    }
-  },
-  "md5": { "type": "keyword" },
-  "sha1": { "type": "keyword" },
-  "sha256": { "type": "keyword" },
-  "sha512": { "type": "keyword" },
-  "created_at": { "type": "date" }
-}
-```
+2. **Hash Indexes**: `hash:index:{algorithm}:{hash}`
+   - Points to the plaintext value
+   - One index per hash algorithm (md5, sha1, sha256, sha512)
+3. **Statistics**: `hash:stats` (Redis Hash)
+   - `count`: Total number of documents
+   - `size`: Total data size in bytes
+
+### Data Flow
+
+```
+Plaintext → Generate Hashes → Store Document
+                                    ↓
+                   Create 4 Indexes (one per algorithm)
+                                    ↓
+                          Update Statistics
+```
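The key patterns above can be captured in small helpers. These names are illustrative; the project's actual `lib/redis.ts` may organize this differently:

```typescript
// Build keys following the documented Redis key patterns.
const plaintextKey = (plaintext: string) => `hash:plaintext:${plaintext}`;
const indexKey = (algorithm: string, hash: string) =>
  `hash:index:${algorithm}:${hash}`;
const STATS_KEY = "hash:stats";

console.log(plaintextKey("password"));
// → hash:plaintext:password
console.log(indexKey("md5", "5f4dcc3b5aa765d61d8327deb882cf99"));
// → hash:index:md5:5f4dcc3b5aa765d61d8327deb882cf99
```

Storing one plaintext then touches six keys: the document, four algorithm indexes, and the `hash:stats` hash.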
## 📁 Project Structure
@@ -233,10 +269,11 @@ hasher/
│ ├── page.tsx # Main UI component
│ └── globals.css # Global styles
├── lib/
-│   ├── elasticsearch.ts     # ES client & index config
+│   ├── redis.ts             # Redis client & operations
│ └── hash.ts # Hash utilities
├── scripts/
-│   └── index-file.ts        # Bulk indexing script
+│   ├── index-file.ts        # Bulk indexing script
+│   └── remove-duplicates.ts # Duplicate removal script
├── package.json
├── tsconfig.json
├── next.config.ts
@@ -257,7 +294,10 @@ npm run start
Create a `.env.local` file:
```env
-ELASTICSEARCH_NODE=http://localhost:9200
+REDIS_HOST=localhost
+REDIS_PORT=6379
+REDIS_PASSWORD=your_password  # Optional
+REDIS_DB=0                    # Optional
```
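Those variables might be resolved into client options like this. A sketch under the assumption of a hypothetical `redisOptionsFromEnv` helper; the diff does not show the actual wiring:

```typescript
// Resolve Redis connection options from the environment, applying the
// documented defaults (localhost:6379, db 0, no password).
function redisOptionsFromEnv(env: Record<string, string | undefined> = process.env) {
  return {
    host: env.REDIS_HOST ?? "localhost",
    port: Number(env.REDIS_PORT ?? 6379),
    password: env.REDIS_PASSWORD, // undefined when unset
    db: Number(env.REDIS_DB ?? 0),
  };
}
```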
### Linting
@@ -277,10 +317,23 @@ npm run lint
## 🚀 Performance
-- **Bulk Indexing**: ~1000-5000 docs/sec (depending on hardware)
-- **Search Latency**: <50ms (typical)
-- **Horizontal Scaling**: 10 shards for parallel processing
-- **Auto-refresh**: Instant search availability for new documents
+- **Bulk Indexing**: ~5000-15000 docs/sec (depending on hardware)
+- **Search Latency**: <5ms (typical)
+- **Memory Efficient**: In-memory storage with optional persistence
+- **Atomic Operations**: Pipeline-based batch operations
+
+## 🔧 Redis Configuration
+
+For optimal performance, consider these Redis settings:
+
+```conf
+# redis.conf
+maxmemory 2gb
+maxmemory-policy allkeys-lru
+save 900 1
+save 300 10
+save 60 10000
+```
## 🤝 Contributing
@@ -299,7 +352,7 @@ This project is open source and available under the [MIT License](LICENSE).
## 🙏 Acknowledgments
- Built with [Next.js](https://nextjs.org/)
-- Powered by [Elasticsearch](https://www.elastic.co/)
+- Powered by [Redis](https://redis.io/)
- Icons by [Lucide](https://lucide.dev/)
- Styled with [Tailwind CSS](https://tailwindcss.com/)
@@ -310,4 +363,3 @@ For issues, questions, or contributions, please open an issue on GitHub.
---
**Made with ❤️ for the security and development community**