path: root/src/corefx
Age | Commit message | Author | Files | Lines
2016-10-26 | Remove dependency of System.Globalization.Native.so on specific ICU version (#7773) | Jan Vorlicek | 15 | -33/+499
Remove dependency of System.Globalization.Native.so on specific ICU version
2016-10-14 | Enable RegionInfo netstandard 1.7 APIs (#7604) | Tarek Mahmoud Sayed | 3 | -4/+77
* Enable RegionInfo netstandard 1.7 APIs
* Fix the typo
* lowercase TRUE and FALSE
2016-05-30 | Fix ucol_setMaxVariable detection for Gentoo Linux (#5309) | Peter Jas | 1 | -0/+7
The issue was that the symbol is exported by the ICU library, so including the headers was not enough: the linker needs the libraries for the detection check to succeed. With this fix, CoreCLR builds successfully on Gentoo Linux. Tested with an LXC Gentoo container on an Ubuntu machine. Steps to configure and build: https://gist.github.com/jasonwilliams200OK/1a2e2c0e904ffa95faf6333fcd88d9b8 Fix #5160
2016-05-19 | Ignore empty collation elements in EndsWith | Matt Ellis | 1 | -29/+41
We should ignore empty collation elements at the end of the string when doing our EndsWith checks. This means the match ICU finds might not span to the end of the string, as long as the only elements between the match and the end of the string are completely ignorable. U+00AD (SOFT HYPHEN) is one such case where the codepoint is completely ignorable. Fixes dotnet/corefx#3467
2016-04-25 | Change how we detect the default locale | Matt Ellis | 3 | -3/+49
Previously, we would just ask ICU what it thought the default locale was, since that seemed like a reasonable thing to do. However, in cases where LANG, LC_MESSAGES and LC_ALL were unset and setlocale(3) returned "C", ICU would use "en-US-POSIX" as a default locale. This is what happens by default when running in Docker, and en-US-POSIX has very odd collation rules (ASCII characters which differ only by case are still treated as separate letters) which trip folks up. So in this case, we'll use Invariant. If setlocale(3) returns a non C/POSIX locale or any of LANG, LC_MESSAGES, or LC_ALL are set to non-empty values, we'll continue to let ICU figure out what to do.
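The decision described in this commit message can be sketched as a small pure function. This is only an illustration under stated assumptions: `ShouldUseInvariantLocale` and `IsUnsetOrEmpty` are hypothetical names, not functions from the shim, and the caller is assumed to pass in the results of `getenv` and `setlocale` itself.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when an environment value is missing or empty.
static bool IsUnsetOrEmpty(const char* value)
{
    return value == nullptr || value[0] == '\0';
}

// Hypothetical sketch of the rule in the commit message, not the shim's code.
// lang, lcMessages and lcAll are assumed to come from getenv(); setlocaleResult
// is assumed to be the string reported by setlocale(3).
// Returns true when the Invariant locale should be used instead of asking ICU.
bool ShouldUseInvariantLocale(const char* lang,
                              const char* lcMessages,
                              const char* lcAll,
                              const char* setlocaleResult)
{
    const bool envUnset = IsUnsetOrEmpty(lang) &&
                          IsUnsetOrEmpty(lcMessages) &&
                          IsUnsetOrEmpty(lcAll);

    // A null result is treated as "not C/POSIX", so ICU keeps deciding.
    const bool isCOrPosix = setlocaleResult != nullptr &&
                            (std::strcmp(setlocaleResult, "C") == 0 ||
                             std::strcmp(setlocaleResult, "POSIX") == 0);

    return envUnset && isCOrPosix;
}
```

In a bare container with none of the three variables set, `ShouldUseInvariantLocale(nullptr, nullptr, nullptr, "C")` would select Invariant, while any non-empty variable leaves the choice to ICU.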
2016-04-18 | enable build of cross target components. | Rahul Kumar | 2 | -1/+3
Currently only enabled for arm64
2016-04-06 | Avoid assert on buffer overflow | Steve Harter | 1 | -1/+1
2016-03-22 | Strip symbols on release builds into separate binaries | Mike McLaughlin | 1 | -1/+2
Issue #3669. Created a common cmake strip_symbols function that all the modules and programs use to strip the symbols out of the main binary into a separate .dbg (Linux) or .dSYM (OSX) file. Added an install_clr cmake function to encapsulate the install logic. Changed all the library module cmake install lines from a TARGETS to a FILES one, because the TARGETS-based install directives caused cmake to relink the binary and copy the unstripped version to the install path. Left all the programs like corerun or ildasm as TARGETS installs because on OSX FILES type installs don't get marked as executable. Need to use "get_property(strip_source_file TARGET ${targetName} PROPERTY LOCATION)" for the older versions of cmake and "set(strip_source_file $<TARGET_FILE:${targetName}>)" on newer versions (v3 or greater).
2016-02-08 | Revert "Add un-prefixed signatures as temporary workaround" | Jan Kotas | 2 | -36/+0
This reverts commit fb80bad2ed19970472ddefe539520abef42a52d0.
2016-02-02 | TimeZoneInfo.DisplayName values are not localized on Linux | Eric Erhardt | 3 | -45/+84
Fixed by calling ICU's ucal_getTimeZoneDisplayName to read the display names for the current locale. Fix https://github.com/dotnet/corefx/issues/2748
2016-01-30 | Revert "Revert "Add un-prefixed signatures as temporary workaround"" | Jan Kotas | 2 | -0/+36
2016-01-30 | Revert "Add un-prefixed signatures as temporary workaround" | Jan Kotas | 2 | -36/+0
This reverts commit 6b1d2938ec4a5a2c64fd849797ec7800ed3ab575.
2016-01-29 | Add un-prefixed signatures as temporary workaround | Jan Kotas | 2 | -0/+36
2016-01-29 | Unique names for GlobalizationNative exports | Jan Kotas | 10 | -166/+168
For consistency and to enable eventual sharing of the same code with CoreRT, I have changed the naming convention for System.Globalization.Native exports to match dotnet/corefx#4818.
2016-01-27 | Update license headers | dotnet-bot | 11 | -44/+33
2016-01-13 | IdnMapping GetUnicode conformance tests fail | Eric Erhardt | 1 | -1/+2
The IdnaConformanceTests fail on Unix because \u00DF, \u200C and \u200D characters are not being handled as specified in the http://www.unicode.org/Public/idna/6.0.0/IdnaTest.txt file. The fix is to use UIDNA_NONTRANSITIONAL_TO_UNICODE and UIDNA_CHECK_CONTEXTJ options when calling uidna_openUTS46. Partial fix for https://github.com/dotnet/corefx/issues/3406.
2016-01-11 | CompareOptions.IgnoreSymbols only ignores punctuation on Unix, but not other symbols | Eric Erhardt | 4 | -2/+33
By default, ICU alternate shifted collation handling only ignores punctuation, not all symbols, so change the "variable top" to include all symbols and currency characters. Fix #4907
2016-01-08 | Convert System.Globalization.Native to use a configure.cmake and .h.in files. | Eric Erhardt | 4 | -17/+17
2015-12-24 | GC OS interface refactoring | Jan Vorlicek | 1 | -0/+1
This change replaces all calls of OS specific functions in the GC by a call to a platform agnostic interface. Critical sections were abstracted too. The logging file access was changed to use CRT functions instead of Windows specific APIs. A "size" member was added to the card_table_info so that we can pass the right size to the VirtualRelease method when destroying the card table. I have also fixed a bug in the gc_heap::make_card_table error path where, when VirtualCommit failed, it called VirtualRelease with the committed size instead of the reserved size.
Other related changes:
- All interlocked operations moved to Interlocked class as static methods
- Removed unused function prototypes
- Shuffled stuff in the root CMakeLists.txt to enable building the GC sample using the settings inherited from the root CMakeLists.txt and to clean up some things that have rotted over time, like the FEATURE_xxx macros not being in one alphabetically ordered block
- Fixed the VOLATILE_MEMORY_BARRIER macro in the gcenv.base.h
- Replaced uint32_t thread id by EEThreadId
- Removed thread handles storage (g_gc_thread) from the GC. The thread handle is closed right after the thread is launched. That allowed me to get rid of the GCThreadHandle
- Renamed the methods of the EEThreadId to be easier to understand
- Moved the gcenv.windows.cpp and gcenv.unix.cpp to the sample folder
2015-12-09 | Fixing collation for the following scenarios: | Eric Erhardt | 1 | -29/+56
1. When IgnoreSymbols is true, ensure we still ignore half and fullwidth characters that are symbols.
2. Hiragana-Katakana characters differ at the tertiary strength, fixing the rule.
3. Fix collation on OSX which uses ICU 55.1. ICU 55 doesn't support having certain unicode characters using primary '<' rules. These characters are not necessary in the rules, since Windows always treats them the same. Removing 0x3099 and 0x309A from the half/full width rules.
2015-12-07 | Address PR feedback. | Eric Erhardt | 1 | -21/+28
2015-12-07 | Ensure the collator map is thread safe on SortHandle. | Eric Erhardt | 1 | -0/+9
2015-12-07 | Adding support for String CompareOptions IgnoreKanaType and IgnoreWidth on Unix. | Eric Erhardt | 1 | -3/+160
2015-12-07 | Adding support for String CompareOptions IgnoreNonSpace and IgnoreSymbols. | Eric Erhardt | 1 | -5/+33
2015-12-07 | Convert SortHandle to use a std::map to cache UCollators per option that is passed in. | Eric Erhardt | 1 | -19/+36
This is in preparation of creating different UCollators for each option.
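A per-option cache of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the shim's code: `FakeCollator` stands in for an ICU `UCollator`, `SortHandleSketch` and its members are hypothetical names, and the mutex reflects the separate "Ensure the collator map is thread safe on SortHandle" entry in this log rather than a known implementation detail.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a UCollator configured for one set of options.
struct FakeCollator
{
    int32_t options;
};

// Hypothetical sketch: lazily creates one collator per options value and
// hands back the cached instance on later calls.
class SortHandleSketch
{
public:
    const FakeCollator* GetCollator(int32_t options)
    {
        // Guard the map so concurrent callers do not race on insertion.
        std::lock_guard<std::mutex> lock(m_mutex);

        auto it = m_collators.find(options);
        if (it == m_collators.end())
        {
            // In real code this is where the expensive collator would be opened.
            ++m_created;
            it = m_collators
                     .emplace(options,
                              std::unique_ptr<FakeCollator>(new FakeCollator{options}))
                     .first;
        }
        return it->second.get();
    }

    // Number of collators actually created (for illustration only).
    int CreatedCount() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_created;
    }

private:
    mutable std::mutex m_mutex;
    std::map<int32_t, std::unique_ptr<FakeCollator>> m_collators;
    int m_created = 0;
};
```

Keying the map on the options value means repeated comparisons with the same options reuse one collator, while a new options value only pays the creation cost once.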
2015-11-20 | Merge pull request #2112 from ellismg/cache-ucollators | Stephen Toub | 1 | -31/+78
Cache UCollators in a Locale
2015-11-20 | Cache UCollators in CompareInfo | Matt Ellis | 1 | -31/+78
Creating a UCollator is an expensive operation and we are presently doing it on every collation operation. We can improve this by caching the UCollators we use for collation on the CompareInfo object itself.

This change introduces a new method GetSortHandle which gives back an opaque wrapper which can be used in collation operations instead of a culture name. Internally we represent this as a struct holding the two types of UCollators we care about (if we add additional collators per locale with different options to handle other types of CompareOption flags, we can cache these as well). Collation methods can get a `const UCollator*` reference from the sort handle which is safe to share across threads (per the ICU Design Guidelines[1]).

Unfortunately, tracking the lifetime of the SortHandle itself is not as straightforward as I would like. Right now, we use a SafeHandle to wrap the internal handle and rely on the finalizer of the class to clean up the native resources. However, this means that the following code sample will create two finalizable objects:

```csharp
var c1 = new CultureInfo("en-US").CompareInfo;
var c2 = new CultureInfo("en-US").CompareInfo;
```

If this ends up being an issue, we could explore an approach where we keep a cache of SortHandles in managed code and pass out references to that SortHandle, which would let us share a single SortHandle for a given locale across more than one CompareInfo object.

Wins are seen in places where we previously did lots of string comparisons in a tight loop (for example: dotnet/corefx#3811), moving these operations down to ~6ms per iteration vs ~330ms on my local machine.

[1]: http://userguide.icu-project.org/design
2015-11-19 | Merge pull request #2047 from eerhardt/ShortDate | Eric Erhardt | 1 | -5/+4
DateTimeFormat.ShortDatePattern should use CLDR 'short' format on Unix.
2015-11-15 | Add USEARCH_DONE check to StartsWith in System.Globalization.Native | stephentoub | 1 | -22/+24
The StartsWith ICU wrapper was not checking the result of usearch_first to see if it was USEARCH_DONE, indicating no match found. This has two ramifications:
1. When there isn't a match, USEARCH_DONE (-1) gets passed in as the textLength argument to ucol_openElements, which treats -1 as meaning the string is null-terminated, and thus ends up walking the string looking for non-ignorable collation elements. Our tests have been passing because they've been using strings containing only non-ignorable elements, and thus the first character checked causes us to bail and correctly return false. If nothing else, this is an unnecessary perf overhead.
2. But on top of that, if there are only ignorable collation elements before the first null character in the string (e.g. if the string begins with a null character), then because we told ICU that the string ended at the first null character, it'll stop walking the string and return a match. e.g. "\0bug".StartsWith("test") returns true incorrectly.
This commit simply adds a check for USEARCH_DONE to StartsWith. EndsWith already has such a check.
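The shape of the fix can be illustrated without ICU. The sketch below is only an analogy under loud assumptions: `kSearchDone` mirrors the -1 sentinel named in the commit message, `FindFirst` is a plain ordinal search standing in for `usearch_first`, and `IsIgnorable` is a hypothetical predicate that treats only NUL and U+00AD as ignorable. None of these are the shim's real functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for ICU's USEARCH_DONE sentinel (-1, per the commit message).
constexpr int32_t kSearchDone = -1;

// Hypothetical stand-in for "completely ignorable": only NUL and U+00AD here.
static bool IsIgnorable(char16_t c)
{
    return c == u'\0' || c == u'\u00AD';
}

// Hypothetical stand-in for usearch_first: a plain ordinal search that
// returns kSearchDone when the target does not occur in the source.
static int32_t FindFirst(const std::u16string& source, const std::u16string& target)
{
    const std::u16string::size_type pos = source.find(target);
    return pos == std::u16string::npos ? kSearchDone : static_cast<int32_t>(pos);
}

bool StartsWithSketch(const std::u16string& source, const std::u16string& target)
{
    const int32_t idx = FindFirst(source, target);

    // The check the commit adds: never treat the sentinel as a length.
    if (idx == kSearchDone)
    {
        return false;
    }

    // A match that is not at index 0 still counts if everything before it
    // is ignorable.
    for (int32_t i = 0; i < idx; ++i)
    {
        if (!IsIgnorable(source[i]))
        {
            return false;
        }
    }
    return true;
}
```

With the sentinel check in place, a source that begins with a null character and does not contain the target reports no match, which is the behaviour the commit message says was previously wrong.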
2015-11-13 | DateTimeFormat.ShortDatePattern should use CLDR 'short' format on Unix. | Eric Erhardt | 1 | -5/+4
The DateTimeFormat.ShortDatePattern is currently defaulting to using CLDR's 'yMd' skeleton. However, this value doesn't produce the best format for all cultures, ex. "de-DE". LongDatePattern uses CLDR's 'full' format. To be symmetrical, the ShortDatePattern should be using CLDR's 'short' format. Fix https://github.com/dotnet/coreclr/issues/1736.
2015-11-13 | Pass target string lengths to ICU on Unix | stephentoub | 1 | -8/+8
Our current ICU shims for StartsWith, EndsWith, IndexOf, and LastIndexOf take the length of the source string but not the length of the target string. This forces ICU to compute the length of the string by searching for a null terminator. We can save those costs and be more accurate around nulls in the target string by passing the known length in.
2015-10-25 | Merge pull request #1851 from ellismg/icu-remove-c-plus-plus | Matt Ellis | 7 | -502/+676
Remove OSX Homebrew ICU dependency
2015-10-23 | Cleanup CMakeLists.txt | Matt Ellis | 2 | -20/+27
There were a few problems that needed to be addressed:
- Our detection logic around testing if ICU supported a feature was still checking for C++ stuff instead of the corresponding C code (which we ended up using).
- There was some cleanup we could do now that the OSX and other Unix builds were split apart
2015-10-23 | Fix spelling issues | Matt Ellis | 1 | -2/+2
2015-10-23 | Use correct close function for UNumberFormat | Matt Ellis | 1 | -1/+1
2015-10-22 | Use std::vector instead of calloc | Matt Ellis | 1 | -18/+6
This matches what we do in other places in calendarData.cpp; the RAII pattern will make it easier to not leak memory.
2015-10-22 | Link against libicucore on OSX | Matt Ellis | 1 | -14/+31
OSX ships with a copy of ICU (their globalization APIs are built on top of it). Since we only use stable C-based APIs, we can link against it using the methods described in "Using a System ICU" in the ICU User's Guide (basically we disable function renaming, don't use C++ and only use stable APIs). The ICU headers are not part of the SDK, so we continue to need ICU installed via Homebrew as a build time dependency. Fixes dotnet/corefx#3849
2015-10-22 | Update IndexOfOrdinalIgnoreCase to use full code units | stephentoub | 1 | -62/+57
2015-10-22 | Improve string.{Last}IndexOf perf on Unix for Ordinal/OrdinalIgnoreCase | stephentoub | 1 | -0/+78
Our current implementation of IndexOfOrdinal for strings on Unix uses Substring to get the piece of the source string we care about; this results in an unnecessary allocation / string copy. When using OrdinalIgnoreCase, we also convert both the source and search strings to upper-case using ToUpperInvariant, resulting in more allocations. And our LastIndexOfOrdinal implementation delegates to IndexOfOrdinal repeatedly, incurring such allocations potentially multiple times. This change reimplements Ordinal searching in managed code to not use Substring, and it implements OrdinalIgnoreCase searching via new functions exposed in the native globalization shim, so as to use ICU without having to make managed/native transitions for each character. With the changes, {Last}IndexOf with Ordinal/OrdinalIgnoreCase are now allocation-free (as you'd expect), and throughput when startIndex/count and/or OrdinalIgnoreCase are used is increased significantly, on my machine anywhere from 20% to 3x, depending on the inputs.
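An allocation-free, case-insensitive ordinal search over a buffer with explicit lengths might be sketched as follows. This is a simplified illustration, not the shim's implementation: `IndexOfOrdinalIgnoreCaseSketch` and `ToUpperAscii` are hypothetical names, and the case folding here covers ASCII letters only, whereas the commit message says the real functions rely on ICU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical ASCII-only upper-casing; real code would defer to ICU.
static char16_t ToUpperAscii(char16_t c)
{
    return (c >= u'a' && c <= u'z') ? static_cast<char16_t>(c - (u'a' - u'A')) : c;
}

// Hypothetical sketch: returns the index of the first case-insensitive
// occurrence of target in source, or -1 when there is none. Both lengths are
// passed explicitly, so no substring or upper-cased copy is allocated.
int32_t IndexOfOrdinalIgnoreCaseSketch(const char16_t* source, int32_t sourceLen,
                                       const char16_t* target, int32_t targetLen)
{
    assert(sourceLen >= 0 && targetLen >= 0);

    if (targetLen == 0)
    {
        return 0; // an empty target matches at the start
    }

    for (int32_t i = 0; i + targetLen <= sourceLen; ++i)
    {
        int32_t j = 0;
        while (j < targetLen && ToUpperAscii(source[i + j]) == ToUpperAscii(target[j]))
        {
            ++j;
        }
        if (j == targetLen)
        {
            return i;
        }
    }
    return -1;
}
```

Comparing folded code units in place is what removes the Substring and ToUpperInvariant allocations the message describes; doing the whole loop on the native side also avoids one managed/native transition per character.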
2015-10-21 | Hygiene cleanups in holders.h | Matt Ellis | 1 | -6/+2
Use "= delete" syntax to make it clear the IcuHolder copy constructor and assignment operators are removed. Remove superfluous "public" modifier on the struct closers used by the IcuHolders.
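For readers unfamiliar with the pattern, a holder of this kind might look like the sketch below. It is a generic illustration and not the contents of holders.h: `HolderSketch`, `FakeResource` and `FakeCloser` are hypothetical names standing in for the IcuHolder, an ICU handle, and its closer struct.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical RAII holder: owns a raw pointer and releases it through a
// closer functor when the holder goes out of scope.
template <typename T, typename Closer>
class HolderSketch
{
public:
    explicit HolderSketch(T* p) : m_p(p) {}

    ~HolderSketch()
    {
        if (m_p != nullptr)
        {
            Closer()(m_p);
        }
    }

    // "= delete" states outright that copying is not supported, so the
    // resource can never be closed twice by two holders.
    HolderSketch(const HolderSketch&) = delete;
    HolderSketch& operator=(const HolderSketch&) = delete;

    T* get() const { return m_p; }

private:
    T* m_p;
};

// Hypothetical resource that records whether it has been closed.
struct FakeResource
{
    bool* closedFlag;
};

// Closer written as a struct: members are public by default, so no explicit
// "public" modifier is needed.
struct FakeCloser
{
    void operator()(FakeResource* r) const { *r->closedFlag = true; }
};

static_assert(!std::is_copy_constructible<HolderSketch<FakeResource, FakeCloser>>::value,
              "holder must not be copyable");
```

Deleting the copy operations turns an accidental copy into a compile error instead of a double close at run time.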
2015-10-21 | Cleanup include directives | Matt Ellis | 5 | -28/+9
2015-10-21 | Remove use of icu::Locale C++ type | Matt Ellis | 5 | -128/+179
Remove all the uses of the icu::Locale type in favor of just using a char* which is the raw locale id (which is exactly what all the ICU C APIs use for a locale). The meat of this change is in locale.cpp to actually handle doing the conversion from UChar* to char*. The rest of the places are dealing with the fallout (GetLocale now has a different signature and the .getName() dance is no longer needed as we have a raw locale name all the time now).
2015-10-21 | Move off Locale instance methods | Matt Ellis | 3 | -93/+138
To prepare for removing icu::Locale in favor of just using the id directly, remove all the uses of Locale methods except for .getName(). We now use GetLocale to create a Locale but then turn it into a char* for all the helper methods. After this change, we can update GetLocale to do locale parsing into a char buffer and replace all the locale.getName() calls with just `locale`.
2015-10-21 | Remove C++ Locale use in EnumSymbols | Matt Ellis | 1 | -3/+4
2015-10-21 | Get Eras using ICU C API instead of C++ | Matt Ellis | 2 | -49/+93
Getting the regular eras is straightforward: we can do the thing we do for other locale data and just ask ICU using a specific UDateFormatSymbolType. For abbreviated eras, there's no C API, but we can try to just read the data from ICU resources and fall back to the standard width eras if that doesn't work.
2015-10-21 | Remove use of ICU C++ DateFormatSymbols | Matt Ellis | 1 | -58/+49
2015-10-21 | Remove usage of ICU C++ DecimalFormatSymbols | Matt Ellis | 1 | -40/+24
2015-10-21 | Remove ICU C++ LocaleDisplayNames | Matt Ellis | 2 | -6/+14
2015-10-21 | Remove use of ICU C++ NumberFormat class | Matt Ellis | 2 | -65/+82
This change removes NumberFormat in favor of UNumberFormat. There is a bit of work that needs to happen in order to keep working the normalization code we use to convert an ICU pattern into something we can match against. Instead of UnicodeStrings, the input to the normalization function is now a UChar* and we build up a std::string during normalization. This allows us to also skip a conversion from UChar* back to char* so we can find the correct pattern in our collection of patterns to examine.
2015-10-21 | Remove use of ICU C++ DateFormatSymbols | Matt Ellis | 1 | -14/+4