Probably the easiest way is to load all your cities into memory (select name from cities) and then use a regex or simple string methods to check whether any of them appear in the text.
List<string> cities = GetCitiesFromDatabase(); // need to implement this yourself
string text = @"the text containing city names such as Amsterdam and San Francisco";
bool containsACity = cities.Any(city => text.Contains(city)); // For a case-insensitive search, use text.IndexOf(city, StringComparison.CurrentCultureIgnoreCase) >= 0 instead
IEnumerable<string> containedCities = cities.Where(city => text.Contains(city));
To ensure that 'Amsterdam' wouldn't match on 'Amsterdamned', you could use a regular expression instead of Contains:
bool containsACity = cities.Any(city => Regex.IsMatch(text, @"\b" + Regex.Escape(city) + @"\b"));
// Add RegexOptions.IgnoreCase for case insensitive matches.
IEnumerable<string> containedCities = cities.Where(city => Regex.IsMatch(text, @"\b" + Regex.Escape(city) + @"\b"));
Alternatively, you can build a large regular expression to search for any city and execute that once:
string regex = @"\b(?:" + String.Join("|", cities.Select(city => Regex.Escape(city)).ToArray()) + @")\b";
bool containsACity = Regex.IsMatch(text, regex, RegexOptions.IgnoreCase);
IEnumerable<string> containedCities = Regex.Matches(text, regex, RegexOptions.IgnoreCase).Cast<Match>().Select(m => m.Value);
You can improve the performance of these calls by caching the list of cities or caching the regular expression. You can improve it even further by creating a static readonly Regex object with RegexOptions.Compiled.
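As a rough sketch of the cached variant, something like the following could work. The class name CityMatcher and the hard-coded Cities array are made up for illustration (in practice the list would come from your database), and sorting longer names first is my own addition so that a hypothetical shorter city such as "San" could not shadow "San Francisco" in the alternation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CityMatcher
{
    // Hypothetical stand-in for the city list loaded once from the database.
    private static readonly string[] Cities = { "Amsterdam", "San Francisco" };

    // Built once and reused by every call. RegexOptions.Compiled costs more
    // at startup in exchange for faster matching afterwards.
    private static readonly Regex CityRegex = new Regex(
        @"\b(?:" + string.Join("|", Cities
            .OrderByDescending(city => city.Length) // longer names first (assumption, see above)
            .Select(city => Regex.Escape(city))) + @")\b",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // True if at least one known city appears as a whole word in the text.
    public static bool ContainsACity(string text)
    {
        return CityRegex.IsMatch(text);
    }

    // Returns every city occurrence, spelled as it appears in the text.
    public static List<string> FindCities(string text)
    {
        return CityRegex.Matches(text).Cast<Match>().Select(m => m.Value).ToList();
    }
}
```

With this in place, CityMatcher.ContainsACity("I visited Amsterdam last year") returns true, while CityMatcher.ContainsACity("I watched Amsterdamned") returns false because of the word boundaries.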
Another solution would be to calculate this in the database. Instead of storing a local list of cities in memory, send the input to the database and use a LIKE statement or a regex inside the database to compare the list of cities against the text. Depending on the number of cities and the size of the text this might be a faster solution, but whether or not this is possible depends on the database being used.
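A minimal sketch of the LIKE variant, written against the provider-neutral System.Data.Common classes, might look like this. It assumes the cities table and name column from the query at the top, an already opened connection, and a provider that accepts @text as a parameter placeholder; the class name CityDatabaseSearch is made up. The || operator is standard SQL concatenation and SQL Server uses + instead, so adjust the query for your database.

```csharp
using System.Collections.Generic;
using System.Data.Common;

public static class CityDatabaseSearch
{
    // Assumed schema: table "cities" with a text column "name".
    // The input text is passed as a parameter and each city name is
    // wrapped in % wildcards, so the database does the substring test.
    public const string Sql =
        "SELECT name FROM cities WHERE @text LIKE '%' || name || '%'";

    // Returns the names of all cities that occur somewhere in the text.
    // The connection is expected to be open already.
    public static List<string> FindCities(DbConnection connection, string text)
    {
        var found = new List<string>();
        using (DbCommand command = connection.CreateCommand())
        {
            command.CommandText = Sql;

            DbParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@text";
            parameter.Value = text;
            command.Parameters.Add(parameter);

            using (DbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    found.Add(reader.GetString(0));
                }
            }
        }
        return found;
    }
}
```

Note that LIKE is a plain substring match, so 'Amsterdam' would match 'Amsterdamned' again here, and a city name containing % or _ would be treated as a wildcard pattern.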