Question

Я использую принятое решение здесь , чтобы преобразовать лист Excel в таблицу данных.Это прекрасно работает, если у меня есть «идеальные» данные, но если у меня пустая ячейка в середине моих данных, кажется, что в каждый столбец помещаются неправильные данные.

Я думаю, это потому, что в приведенном ниже коде:

row.Descendants<Cell>().Count()

- количество заполненных ячеек (не все столбцы) И:

GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));

, кажется, находит следующую заполненную ячейку (необязательно, что находится в этом индексе), поэтому, если первыйстолбец пуст, и я вызываю ElementAt (0), он возвращает значение во втором столбце.

Вот полный код синтаксического анализа.

DataRow tempRow = dt.NewRow();

for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
{
    tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
    if (tempRow[i].ToString().IndexOf("Latency issues in") > -1)
    {
        Console.Write(tempRow[i].ToString());
    }
}

amurra · Answer 1 · 20 октября 2010

Это имеет смысл, поскольку Excel не будет хранить значение для ячейки, которая является нулевой.Если вы откроете свой файл с помощью инструмента повышения производительности Open XML SDK 2.0 и просмотрите XML до уровня ячеек, вы увидите, что в этом файле будут находиться только те ячейки, в которых есть данные.

Вы можете вставить пустые данные в диапазон ячеек, которые вы собираетесь обойти, или программно выяснить, была ли пропущена ячейка, и соответствующим образом настроить ваш индекс.

Я создал пример документа Excel сстрока в ячейке ссылки A1 и C1.Затем я открыл документ Excel в инструменте повышения производительности Open XML, и вот XML-файл, который был сохранен:

<x:row r="1" spans="1:3" 
   xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <x:c r="A1" t="s">
    <x:v>0</x:v>
  </x:c>
  <x:c r="C1" t="s">
    <x:v>1</x:v>
  </x:c>
</x:row>

Здесь вы увидите, что данные соответствуют первой строке и что только две ячейки стоятданные сохраняются для этой строки.Сохраненные данные соответствуют A1 и C1, и ни одна ячейка с нулевыми значениями не сохраняется.

Чтобы получить необходимую вам функциональность, вы можете перемещаться по ячейкам, как вы делали выше, но вам нужно будет проверить, на какое значение ссылается ячейка, и определить, были ли пропущены какие-либо ячейки.для этого вам понадобятся две служебные функции, чтобы получить имя столбца из ссылки на ячейку и затем преобразовать имя этого столбца в индекс, основанный на нуле:

    private static List<char> Letters = new List<char>() { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', ' ' };

    /// <summary>
    /// Given a cell name, parses the specified cell to get the column name.
    /// </summary>
    /// <param name="cellReference">Address of the cell (ie. B2)</param>
    /// <returns>Column Name (ie. B)</returns>
    public static string GetColumnName(string cellReference)
    {
        // Create a regular expression to match the column name portion of the cell name.
        Regex regex = new Regex("[A-Za-z]+");
        Match match = regex.Match(cellReference);

        return match.Value;
    }

    /// <summary>
    /// Given just the column name (no row index), it will return the zero based column index.
    /// Note: This method will only handle columns with a length of up to two (ie. A to Z and AA to ZZ). 
    /// A length of three can be implemented when needed.
    /// </summary>
    /// <param name="columnName">Column Name (ie. A or AB)</param>
    /// <returns>Zero based index if the conversion was successful; otherwise null</returns>
    public static int? GetColumnIndexFromName(string columnName)
    {
        int? columnIndex = null;

        string[] colLetters = Regex.Split(columnName, "([A-Z]+)");
        colLetters = colLetters.Where(s => !string.IsNullOrEmpty(s)).ToArray();

        if (colLetters.Count() <= 2)
        {
            int index = 0;
            foreach (string col in colLetters)
            {
                List<char> col1 = colLetters.ElementAt(index).ToCharArray().ToList();
                int? indexValue = Letters.IndexOf(col1.ElementAt(index));

                if (indexValue != -1)
                {
                    // The first letter of a two digit column needs some extra calculations
                    if (index == 0 && colLetters.Count() == 2)
                    {
                        columnIndex = columnIndex == null ? (indexValue + 1) * 26 : columnIndex + ((indexValue + 1) * 26);
                    }
                    else
                    {
                        columnIndex = columnIndex == null ? indexValue : columnIndex + indexValue;
                    }
                }

                index++;
            }
        }

        return columnIndex;
    }

Затем вы можете выполнить итерацию по ячейкам и проверитьпосмотрите, что ссылка на ячейку сравнивается с columnIndex.Если оно меньше, чем вы добавляете пустые данные в ваш tempRow, в противном случае просто прочитайте значение, содержащееся в ячейке.(Примечание: я не тестировал приведенный ниже код, но общая идея должна помочь):

DataRow tempRow = dt.NewRow();

int columnIndex = 0;
foreach (Cell cell in row.Descendants<Cell>())
{
   // Gets the column index of the cell with data
   int cellColumnIndex = (int)GetColumnIndexFromName(GetColumnName(cell.CellReference));

   if (columnIndex < cellColumnIndex)
   {
      do
      {
         tempRow[columnIndex] = //Insert blank data here;
         columnIndex++;
      }
      while(columnIndex < cellColumnIndex);
    }
    tempRow[columnIndex] = GetCellValue(spreadSheetDocument, cell);

    if (tempRow[i].ToString().IndexOf("Latency issues in") > -1)
    {
       Console.Write(tempRow[i].ToString());
    }
    columnIndex++;
}

Waylon Flinn · Answer 2 · 12 июля 2011

Вот реализация IEnumerable, которая должна делать то, что вы хотите, скомпилирована и протестирована модулем.

    ///<summary>returns an empty cell when a blank cell is encountered
    ///</summary>
    public IEnumerator<Cell> GetEnumerator()
    {
        int currentCount = 0;

        // row is a class level variable representing the current
        // DocumentFormat.OpenXml.Spreadsheet.Row
        foreach (DocumentFormat.OpenXml.Spreadsheet.Cell cell in
            row.Descendants<DocumentFormat.OpenXml.Spreadsheet.Cell>())
        {
            string columnName = GetColumnName(cell.CellReference);

            int currentColumnIndex = ConvertColumnNameToNumber(columnName);

            for ( ; currentCount < currentColumnIndex; currentCount++)
            {
                yield return new DocumentFormat.OpenXml.Spreadsheet.Cell();
            }

            yield return cell;
            currentCount++;
        }
    }

Вот функции, на которые она опирается:

    /// <summary>
    /// Given a cell name, parses the specified cell to get the column name.
    /// </summary>
    /// <param name="cellReference">Address of the cell (ie. B2)</param>
    /// <returns>Column Name (ie. B)</returns>
    public static string GetColumnName(string cellReference)
    {
        // Match the column name portion of the cell name.
        Regex regex = new Regex("[A-Za-z]+");
        Match match = regex.Match(cellReference);

        return match.Value;
    }

    /// <summary>
    /// Given just the column name (no row index),
    /// it will return the zero based column index.
    /// </summary>
    /// <param name="columnName">Column Name (ie. A or AB)</param>
    /// <returns>Zero based index if the conversion was successful</returns>
    /// <exception cref="ArgumentException">thrown if the given string
    /// contains characters other than uppercase letters</exception>
    public static int ConvertColumnNameToNumber(string columnName)
    {
        Regex alpha = new Regex("^[A-Z]+$");
        if (!alpha.IsMatch(columnName)) throw new ArgumentException();

        char[] colLetters = columnName.ToCharArray();
        Array.Reverse(colLetters);

        int convertedValue = 0;
        for (int i = 0; i < colLetters.Length; i++)
        {
            char letter = colLetters[i];
            int current = i == 0 ? letter - 65 : letter - 64; // ASCII 'A' = 65
            convertedValue += current * (int)Math.Pow(26, i);
        }

        return convertedValue;
    }

Бросьте этов классе и попробуйте.

Tim Schmelter · Answer 3 · 20 марта 2014

Вот слегка измененная версия ответа Уэйлона , в котором также использовались другие ответы.Он инкапсулирует его метод в классе.

Я изменил

IEnumerator<Cell> GetEnumerator()

на

IEnumerable<Cell> GetRowCells(Row row)

Вот класс, вам не нужно создавать его экземпляр, он простослужит служебным классом:

public class SpreedsheetHelper
{
    ///<summary>returns an empty cell when a blank cell is encountered
    ///</summary>
    public static IEnumerable<Cell> GetRowCells(Row row)
    {
        int currentCount = 0;

        foreach (DocumentFormat.OpenXml.Spreadsheet.Cell cell in
            row.Descendants<DocumentFormat.OpenXml.Spreadsheet.Cell>())
        {
            string columnName = GetColumnName(cell.CellReference);

            int currentColumnIndex = ConvertColumnNameToNumber(columnName);

            for (; currentCount < currentColumnIndex; currentCount++)
            {
                yield return new DocumentFormat.OpenXml.Spreadsheet.Cell();
            }

            yield return cell;
            currentCount++;
        }
    }

    /// <summary>
    /// Given a cell name, parses the specified cell to get the column name.
    /// </summary>
    /// <param name="cellReference">Address of the cell (ie. B2)</param>
    /// <returns>Column Name (ie. B)</returns>
    public static string GetColumnName(string cellReference)
    {
        // Match the column name portion of the cell name.
        var regex = new System.Text.RegularExpressions.Regex("[A-Za-z]+");
        var match = regex.Match(cellReference);

        return match.Value;
    }

    /// <summary>
    /// Given just the column name (no row index),
    /// it will return the zero based column index.
    /// </summary>
    /// <param name="columnName">Column Name (ie. A or AB)</param>
    /// <returns>Zero based index if the conversion was successful</returns>
    /// <exception cref="ArgumentException">thrown if the given string
    /// contains characters other than uppercase letters</exception>
    public static int ConvertColumnNameToNumber(string columnName)
    {
        var alpha = new System.Text.RegularExpressions.Regex("^[A-Z]+$");
        if (!alpha.IsMatch(columnName)) throw new ArgumentException();

        char[] colLetters = columnName.ToCharArray();
        Array.Reverse(colLetters);

        int convertedValue = 0;
        for (int i = 0; i < colLetters.Length; i++)
        {
            char letter = colLetters[i];
            int current = i == 0 ? letter - 65 : letter - 64; // ASCII 'A' = 65
            convertedValue += current * (int)Math.Pow(26, i);
        }

        return convertedValue;
    }
}

Теперь вы можете получить ячейки всех строк следующим образом:

// skip the part that retrieves the worksheet sheetData
IEnumerable<Row> rows = sheetData.Descendants<Row>();
foreach(Row row in rows)
{
    IEnumerable<Cell> cells = SpreedsheetHelper.GetRowCells(row);
    foreach (Cell cell in cells)
    {
         // skip part that reads the text according to the cell-type
    }
}

Он будет содержать все ячейки, даже если они пусты.

jaccso · Answer 4 · 24 октября 2012

Смотрите мою реализацию:

  Row[] rows = worksheet.GetFirstChild<SheetData>()
                .Elements<Row>()
                .ToArray();

  string[] columnNames = rows.First()
                .Elements<Cell>()
                .Select(cell => GetCellValue(cell, document))
                .ToArray();

  HeaderLetters = ExcelHeaderHelper.GetHeaderLetters((uint)columnNames.Count());

  if (columnNames.Count() != HeaderLetters.Count())
  {
       throw new ArgumentException("HeaderLetters");
  }

  IEnumerable<List<string>> cellValues = GetCellValues(rows.Skip(1), columnNames.Count(), document);

//Here you can enumerate through the cell values, based on the cell index the column names can be retrieved.

Заголовочные письма собираются с использованием этого класса:

    private static class ExcelHeaderHelper
    {
        public static string[] GetHeaderLetters(uint max)
        {
            var result = new List<string>();
            int i = 0;
            var columnPrefix = new Queue<string>();
            string prefix = null;
            int prevRoundNo = 0;
            uint maxPrefix = max / 26;

            while (i < max)
            {
                int roundNo = i / 26;
                if (prevRoundNo < roundNo)
                {
                    prefix = columnPrefix.Dequeue();
                    prevRoundNo = roundNo;
                }
                string item = prefix + ((char)(65 + (i % 26))).ToString(CultureInfo.InvariantCulture);
                if (i <= maxPrefix)
                {
                    columnPrefix.Enqueue(item);
                }
                result.Add(item);
                i++;
            }
            return result.ToArray();
        }
    }

И вспомогательные методы:

    private static IEnumerable<List<string>> GetCellValues(IEnumerable<Row> rows, int columnCount, SpreadsheetDocument document)
    {
        var result = new List<List<string>>();
        foreach (var row in rows)
        {
            List<string> cellValues = new List<string>();
            var actualCells = row.Elements<Cell>().ToArray();

            int j = 0;
            for (int i = 0; i < columnCount; i++)
            {
                if (actualCells.Count() <= j || !actualCells[j].CellReference.ToString().StartsWith(HeaderLetters[i]))
                {
                    cellValues.Add(null);
                }
                else
                {
                    cellValues.Add(GetCellValue(actualCells[j], document));
                    j++;
                }
            }
            result.Add(cellValues);
        }
        return result;
    }


private static string GetCellValue(Cell cell, SpreadsheetDocument document)
{
    bool sstIndexedcell = GetCellType(cell);
    return sstIndexedcell
        ? GetSharedStringItemById(document.WorkbookPart, Convert.ToInt32(cell.InnerText))
        : cell.InnerText;
}

private static bool GetCellType(Cell cell)
{
    return cell.DataType != null && cell.DataType == CellValues.SharedString;
}

private static string GetSharedStringItemById(WorkbookPart workbookPart, int id)
{
    return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id).InnerText;
}

Решение работает с общими ячейками (индексированные ячейки SST).

howardlo · Answer 5 · 08 ноября 2012

Буквенный код - это кодировка 26, так что это должно работать для преобразования его в смещение.

// Converts letter code (i.e. AA) to an offset
public int offset( string code)
{
    var offset = 0;
    var byte_array = Encoding.ASCII.GetBytes( code ).Reverse().ToArray();
    for( var i = 0; i < byte_array.Length; i++ )
    {
        offset += (byte_array[i] - 65 + 1) * Convert.ToInt32(Math.Pow(26.0, Convert.ToDouble(i)));
    }
    return offset - 1;
}

Geoffrey Malafsky · Answer 6 · 10 сентября 2012

Все хорошие примеры.Вот тот, который я использую, так как мне нужно отслеживать все строки, ячейки, значения и заголовки для корреляции и анализа.

Метод ReadSpreadsheet открывает файл xlxs и просматривает каждый лист, строку и столбец.Так как значения хранятся в ссылочной таблице строк, я также явно использую это для каждого листа.Используются и другие классы: DSFunction и StaticVariables.Последний содержит часто используемые значения параметров, такие как "quotdouble" (quotdouble = "\ u0022";) и crlf (crlf = "\ u000D" + "\ u000A";).

Соответствующий метод DSFunction GetIntColIndexForLetter включен ниже.Он возвращает целочисленное значение для индекса столбца, соответствующего буквенным именам, таким как (A, B, AA, ADE и т. Д.).Это используется вместе с параметром ncellcolref, чтобы определить, были ли пропущены какие-либо столбцы, и ввести пустые строковые значения для каждого пропущенного столбца.

Я также делаю некоторую очистку значений перед временным сохранением в объекте List (используя метод Replace).

Впоследствии я использую хеш-таблицу (Словарь) имен столбцов для извлечения значений из разных таблиц, их корреляции, создания нормализованных значений, а затем создания объекта, используемого в нашем продукте, который затем сохраняется в виде файла XML,Ничего из этого не показано, но именно поэтому этот подход используется.

    public static class DSFunction {

    /// <summary>
    /// Creates an integer value for a column letter name starting at 1 for 'a'
    /// </summary>
    /// <param name="lettstr">Column name as letters</param>
    /// <returns>int value</returns>
    public static int GetIntColIndexForLetter(string lettstr) {
        string txt = "", txt1="";
        int n1, result = 0, nbeg=-1, nitem=0;
        try {
            nbeg = (int)("a".ToCharArray()[0]) - 1; //1 based
            txt = lettstr;
            if (txt != "") txt = txt.ToLower().Trim();
            while (txt != "") {
                if (txt.Length > 1) {
                    txt1 = txt.Substring(0, 1);
                    txt = txt.Substring(1);
                }
                else {
                    txt1 = txt;
                    txt = "";
                }
                if (!DSFunction.IsNumberString(txt1, "real")) {
                    nitem++;
                    n1 = (int)(txt1.ToCharArray()[0]) - nbeg;
                    result += n1 + (nitem - 1) * 26;
                }
                else {
                    break;
                }
            }
        }
        catch (Exception ex) {
            txt = ex.Message;
        }
        return result;
    }


}


    public static class Extractor {

    public static string ReadSpreadsheet(string fileUri) {
        string msg = "", txt = "", txt1 = "";
        int i, n1, n2, nrow = -1, ncell = -1, ncellcolref = -1;
        Boolean haveheader = true;
        Dictionary<string, int> hashcolnames = new Dictionary<string, int>();
        List<string> colvalues = new List<string>();
        try {
            if (!File.Exists(fileUri)) { throw new Exception("file does not exist"); }
            using (SpreadsheetDocument ssdoc = SpreadsheetDocument.Open(fileUri, true)) {
                var stringTable = ssdoc.WorkbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                foreach (Sheet sht in ssdoc.WorkbookPart.Workbook.Descendants<Sheet>()) {
                    nrow = 0;
                    foreach (Row ssrow in ((WorksheetPart)(ssdoc.WorkbookPart.GetPartById(sht.Id))).Worksheet.Descendants<Row>()) {
                        ncell = 0;
                        ncellcolref = 0;
                        nrow++;
                        colvalues.Clear();
                        foreach (Cell sscell in ssrow.Elements<Cell>()) {
                            ncell++;
                            n1 = DSFunction.GetIntColIndexForLetter(sscell.CellReference);
                            for (i = 0; i < (n1 - ncellcolref - 1); i++) {
                                if (nrow == 1 && haveheader) {
                                    txt1 = "-missing" + (ncellcolref + 1 + i).ToString() + "-";
                                    if (!hashcolnames.TryGetValue(txt1, out n2)) {
                                        hashcolnames.Add(txt1, ncell - 1);
                                    }
                                }
                                else {
                                    colvalues.Add("");
                                }
                            }
                            ncellcolref = n1;
                            if (sscell.DataType != null) {
                                if (sscell.DataType.Value == CellValues.SharedString && stringTable != null) {
                                    txt = stringTable.SharedStringTable.ElementAt(int.Parse(sscell.InnerText)).InnerText;
                                }
                                else if (sscell.DataType.Value == CellValues.String) {
                                    txt = sscell.InnerText;
                                }
                                else txt = sscell.InnerText.ToString();
                            }
                            else txt = sscell.InnerText;
                            if (txt != "") txt1 = txt.ToLower().Trim(); else txt1 = "";
                            if (nrow == 1 && haveheader) {
                                txt1 = txt1.Replace(" ", "");
                                if (txt1 == "table/viewname") txt1 = "tablename";
                                else if (txt1 == "schemaownername") txt1 = "schemaowner";
                                else if (txt1 == "subjectareaname") txt1 = "subjectarea";
                                else if (txt1.StartsWith("column")) {
                                    txt1 = txt1.Substring("column".Length);
                                }
                                if (!hashcolnames.TryGetValue(txt1, out n1)) {
                                    hashcolnames.Add(txt1, ncell - 1);
                                }
                            }
                            else {
                                txt = txt.Replace(((char)8220).ToString(), "'");  //special "
                                txt = txt.Replace(((char)8221).ToString(), "'"); //special "
                                txt = txt.Replace(StaticVariables.quotdouble, "'");
                                txt = txt.Replace(StaticVariables.crlf, " ");
                                txt = txt.Replace("  ", " ");
                                txt = txt.Replace("<", "");
                                txt = txt.Replace(">", "");
                                colvalues.Add(txt);
                            }
                        }
                    }
                }
            }
        }
        catch (Exception ex) {
            msg = "notok:" + ex.Message;
        }
        return msg;
    }





}

Mike Gledhill · Answer 7 · 08 мая 2017

С извинениями за публикацию еще одного ответа на этот вопрос, вот код, который я использовал.

У меня были проблемы с тем, что OpenXML не работал должным образом, если на рабочем листе сверху была пустая строка. Иногда он просто возвращает DataTable с 0 строками и 0 столбцами. Код ниже справляется с этим и со всеми другими рабочими листами.

Вот как бы вы назвали мой код. Просто введите имя файла и имя листа для чтения:

DataTable dt = OpenXMLHelper.ExcelWorksheetToDataTable("C:\\SQL Server\\SomeExcelFile.xlsx", "Mikes Worksheet");

А вот и сам код:

    public class OpenXMLHelper
    {
        //  A helper function to open an Excel file using OpenXML, and return a DataTable containing all the data from one
        //  of the worksheets.
        //
        //  We've had lots of problems reading in Excel data using OLEDB (eg the ACE drivers no longer being present on new servers,
        //  OLEDB not working due to security issues, and blatantly ignoring blank rows at the top of worksheets), so this is a more 
        //  stable method of reading in the data.
        //
        public static DataTable ExcelWorksheetToDataTable(string pathFilename, string worksheetName)
        {
            DataTable dt = new DataTable(worksheetName);

            using (SpreadsheetDocument document = SpreadsheetDocument.Open(pathFilename, false))
            {
                // Find the sheet with the supplied name, and then use that 
                // Sheet object to retrieve a reference to the first worksheet.
                Sheet theSheet = document.WorkbookPart.Workbook.Descendants<Sheet>().Where(s => s.Name == worksheetName).FirstOrDefault();
                if (theSheet == null)
                    throw new Exception("Couldn't find the worksheet: " + worksheetName);

                // Retrieve a reference to the worksheet part.
                WorksheetPart wsPart = (WorksheetPart)(document.WorkbookPart.GetPartById(theSheet.Id));
                Worksheet workSheet = wsPart.Worksheet;

                string dimensions = workSheet.SheetDimension.Reference.InnerText;       //  Get the dimensions of this worksheet, eg "B2:F4"

                int numOfColumns = 0;
                int numOfRows = 0;
                CalculateDataTableSize(dimensions, ref numOfColumns, ref numOfRows);
                System.Diagnostics.Trace.WriteLine(string.Format("The worksheet \"{0}\" has dimensions \"{1}\", so we need a DataTable of size {2}x{3}.", worksheetName, dimensions, numOfColumns, numOfRows));

                SheetData sheetData = workSheet.GetFirstChild<SheetData>();
                IEnumerable<Row> rows = sheetData.Descendants<Row>();

                string[,] cellValues = new string[numOfColumns, numOfRows];

                int colInx = 0;
                int rowInx = 0;
                string value = "";
                SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;

                //  Iterate through each row of OpenXML data, and store each cell's value in the appropriate slot in our [,] string array.
                foreach (Row row in rows)
                {
                    for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
                    {
                        //  *DON'T* assume there's going to be one XML element for each column in each row...
                        Cell cell = row.Descendants<Cell>().ElementAt(i);
                        if (cell.CellValue == null || cell.CellReference == null)
                            continue;                       //  eg when an Excel cell contains a blank string

                        //  Convert this Excel cell's CellAddress into a 0-based offset into our array (eg "G13" -> [6, 12])
                        colInx = GetColumnIndexByName(cell.CellReference);             //  eg "C" -> 2  (0-based)
                        rowInx = GetRowIndexFromCellAddress(cell.CellReference)-1;     //  Needs to be 0-based

                        //  Fetch the value in this cell
                        value = cell.CellValue.InnerXml;
                        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                        {
                            value = stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
                        }

                        cellValues[colInx, rowInx] = value;
                    }
                }

                //  Copy the array of strings into a DataTable.
                //  We don't (currently) make any attempt to work out which columns should be numeric, rather than string.
                for (int col = 0; col < numOfColumns; col++)
                    dt.Columns.Add("Column_" + col.ToString());

                for (int row = 0; row < numOfRows; row++)
                {
                    DataRow dataRow = dt.NewRow();
                    for (int col = 0; col < numOfColumns; col++)
                    {
                        dataRow.SetField(col, cellValues[col, row]);
                    }
                    dt.Rows.Add(dataRow);
                }

#if DEBUG
                //  Write out the contents of our DataTable to the Output window (for debugging)
                string str = "";
                for (rowInx = 0; rowInx < maxNumOfRows; rowInx++)
                {
                    for (colInx = 0; colInx < maxNumOfColumns; colInx++)
                    {
                        object val = dt.Rows[rowInx].ItemArray[colInx];
                        str += (val == null) ? "" : val.ToString();
                        str += "\t";
                    }
                    str += "\n";
                }
                System.Diagnostics.Trace.WriteLine(str);
#endif
                return dt;
            }
        }

        private static void CalculateDataTableSize(string dimensions, ref int numOfColumns, ref int numOfRows)
        {
            //  How many columns & rows of data does this Worksheet contain ?  
            //  We'll read in the Dimensions string from the Excel file, and calculate the size based on that.
            //      eg "B1:F4" -> we'll need 6 columns and 4 rows.
            //
            //  (We deliberately ignore the top-left cell address, and just use the bottom-right cell address.)
            try
            {
                string[] parts = dimensions.Split(':');     // eg "B1:F4" 
                if (parts.Length != 2)
                    throw new Exception("Couldn't find exactly *two* CellAddresses in the dimension");

                numOfColumns = 1 + GetColumnIndexByName(parts[1]);     //  A=1, B=2, C=3  (1-based value), so F4 would return 6 columns
                numOfRows = GetRowIndexFromCellAddress(parts[1]);
            }
            catch
            {
                throw new Exception("Could not calculate maximum DataTable size from the worksheet dimension: " + dimensions);
            }
        }

        public static int GetRowIndexFromCellAddress(string cellAddress)
        {
            //  Convert an Excel CellReference column into a 1-based row index
            //  eg "D42"  ->  42
            //     "F123" ->  123
            string rowNumber = System.Text.RegularExpressions.Regex.Replace(cellAddress, "[^0-9 _]", "");
            return int.Parse(rowNumber);
        }

        public static int GetColumnIndexByName(string cellAddress)
        {
            //  Convert an Excel CellReference column into a 0-based column index
            //  eg "D42" ->  3
            //     "F123" -> 5
            var columnName = System.Text.RegularExpressions.Regex.Replace(cellAddress, "[^A-Z_]", "");
            int number = 0, pow = 1;
            for (int i = columnName.Length - 1; i >= 0; i--)
            {
                number += (columnName[i] - 'A' + 1) * pow;
                pow *= 26;
            }
            return number - 1;
        }
    }

Renzo Ciot · Answer 8 · 22 января 2013

Вы можете использовать эту функцию, чтобы извлечь ячейку из строки, передающей индекс заголовка:

public static Cell GetCellFromRow(Row r ,int headerIdx) {
        string cellname = GetNthColumnName(headerIdx) + r.RowIndex.ToString();
        IEnumerable<Cell> cells = r.Elements<Cell>().Where(x=> x.CellReference == cellname);
        if (cells.Count() > 0)
        {
            return cells.First();
        }
        else {
            return null;
        }
}
public static string GetNthColumnName(int n)
    {
        string name = "";
        while (n > 0)
        {
            n--;
            name = (char)('A' + n % 26) + name;
            n /= 26;
        }
        return name;
    }

Owain Reed · Answer 9 · 29 августа 2012

Хорошо, я не совсем эксперт в этом вопросе, но другие ответы мне кажутся чрезмерными, поэтому вот мое решение:

// Loop through each row in the spreadsheet, skipping the header row
foreach (var row in sheetData.Elements<Row>().Skip(1))
{
    var i = 0;
    string[] letters = new string[15] {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O" };

    List<String> cellsList = new List<string>();
    foreach (var cell in row.Elements<Cell>().ToArray())
    {
        while (cell.CellReference.ToString()[0] != Convert.ToChar(letters[i]))
        {//accounts for multiple consecutive blank cells
            cellsList.Add("");
            i++;
        }
        cellsList.Add(cell.CellValue.Text);
        i++;
    }

    string[] cells = cellsList.ToArray();

    foreach(var cell in cellsList)
    {
        //display contents of cell, depending on the datatype you may need to call each of the cells manually
    }
}

Надеюсь, кто-то найдет это полезным!

teatime · Answer 10 · 04 сентября 2017

Добавлена еще одна реализация, на этот раз, где количество столбцов известно заранее:

        /// <summary>
        /// Gets a list cells that are padded with empty cells where necessary.
        /// </summary>
        /// <param name="numberOfColumns">The number of columns expected.</param>
        /// <param name="cells">The cells.</param>
        /// <returns>List of padded cells</returns>
        private static IList<Cell> GetPaddedCells(int numberOfColumns, IList<Cell> cells)
        {
            // Only perform the padding operation if existing column count is less than required
            if (cells.Count < numberOfColumns - 1)
            {
                IList<Cell> padded = new List<Cell>();
                int cellIndex = 0;

                for (int paddedIndex = 0; paddedIndex < numberOfColumns; paddedIndex++)
                {
                    if (cellIndex < cells.Count)
                    {
                        // Grab column reference (ignore row) <seealso cref="https://stackoverflow.com/a/7316298/674776"/>
                        string columnReference = new string(cells[cellIndex].CellReference.ToString().Where(char.IsLetter).ToArray());

                        // Convert reference to index <seealso cref="https://stackoverflow.com/a/848552/674776"/>
                        int indexOfReference = columnReference.ToUpper().Aggregate(0, (column, letter) => (26 * column) + letter - 'A' + 1) - 1;

                        // Add padding cells where current cell index is less than required
                        while (indexOfReference > paddedIndex)
                        {
                            padded.Add(new Cell());
                            paddedIndex++;
                        }

                        padded.Add(cells[cellIndex++]);
                    }
                    else
                    {
                        // Add padding cells when passed existing cells
                        padded.Add(new Cell());
                    }
                }

                return padded;
            }
            else
            {
                return cells;
            }
        }

Вызов с помощью:

IList<Cell> cells = GetPaddedCells(38, row.Descendants<Cell>().ToList());

Где 38 - требуемое количество столбцов.

чтение Excel Open XML игнорирует пустые ячейки

Пожалуйста, войдите или зарегистрируйтесь чтобы ответить на этот вопрос.

Ответы [ 13 ]

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

чтение Excel Open XML игнорирует пустые ячейки

Пожалуйста, войдите или зарегистрируйтесь чтобы ответить на этот вопрос.

Ответы [ 13 ]

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Пожалуйста, войдите или зарегистрируйтесь что бы добавить комментарий.

Похожие темы