A Dictionary will do the job. However, if you are doing rapid partial matches (e.g. search as the user types) you may get better performance by creating multiple keys which point to the same item. For example, the word "Apple" could be located with "Ap", "App", "Appl", and "Apple".
I have used this approach on a similar number of records with very good results. I have turned my 10K source items into about 50K unique keys. Each of these Dictionary entries points to a list containing references to all matches for that term. You can then search this much smaller list more efficiently. Despite the large number of lists this creates, the memory footprint is quite reasonable.
You can also make up your own keys if desired to redirect common misspellings or point to related items. This also eliminates most of the issues with unique keys because each key points to a list. A single item may be classified by each of the words in its name; this is extremely useful if you have long product names with multiple words in it. When classifying your items, each word in the name can be mapped to one or more keys.
I should also point out that building and classifying 10K items shouldn't take long if done correctly (couple hundred milliseconds is reasonable). The results can be cached for as long as you want using Application
, Cache
, or static members.
To summarize, the resulting structure is a Dictionary<string, List<T>>
where the string is a short (2-6 characters works well) but unique key. Each key points to a List<T>
(or other collection, if you are so inclined) of items which match that key. When a search is performed, you locate the key which matches the term provided by the user. Depending on the length of your keys, you may truncate the user's search to your maximum key length. After locating the correct child collection, you then search that collection for a complete or partial match using whatever methodology you wish.
Lastly, you may wish to create a lightweight structure for each item in the list so that you can store additional information about the item. For example, you might create a small Product class which stores the name, price, department, and popularity of the product. This can help you refine the results you show to the user.
All-in-all, you can perform intelligent, detailed, fuzzy searches in real-time.
The aforementioned structures should provide functionality roughly equivalent to a trie.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…