I want to conditionally color each character in every cell of a column. I can partly imagine what to do (I'm new to R):
1. open xlsx table or txt and change it to xlsx
2. iterate through column (treat each cell as a vector)
3. iterate through each vector (characters) and change color conditionally
(and through regex find lines which will be colored - sequences)
4. save to xlsx
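A minimal base-R sketch of steps 2 and 3 (without the Excel part), assuming the lines have already been read into a character vector. The sample `lines`, the helper name `color_for`, and the color rule itself are hypothetical placeholders, not part of my real data or requirements:

```r
# Hypothetical lines in the same layout as the alignment output
lines <- c(
  "ss1 HHHHEEEE",
  "#1 LAKELTYT",
  "#c ---+----",
  "#2 VEFPIKPG"
)

# Keep only the sequence lines: "#", one digit, then a space
is_seq <- grepl("^#[0-9] ", lines)
seqs <- sub("^#[0-9] ", "", lines[is_seq])

# Hypothetical rule: some hydrophobic residues red, gaps grey, the rest black
color_for <- function(ch) {
  ifelse(ch %in% c("A", "I", "L", "V", "F", "M", "W"), "red",
         ifelse(ch == "-", "grey", "black"))
}

# Treat each cell as a vector of single characters and pick a color per character
chars <- strsplit(seqs, "")
colors <- lapply(chars, color_for)
```

Here `colors[[i]]` has one entry per character of `seqs[i]`, so the two can be walked in parallel when the formatting is applied.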
But I don't know how to color individual elements in xlsx (or which library to use), or how to save the file with that change.
Sample data
>>f_2;hypothetical protein L_2128 [Legionella] {gene:L_2128}_start=1;end=300;length=300;source_length=320
LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCTLINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKASKARIHVADIEKIVGDKISYKNKNVISG
&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 | -0.368E+01 >>_vfdb.0002001_ VFG001328(gi:21283614) (sak) Staphylokinase precursor [Staphylokinase (VF0021)] [Staphylococcus aureus subsp. aureus MW2] :_: Length: 163
ss1 HHHHHHHHHEEEETTCCCCCCCHHHEEEHTTTTTTTH-HHHHHHEEEEHHHHHHTHEEEEEECTCCCCHHHEEECCCCCCTTEHHHHHHHHHHH
#1 LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCT-LINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKAS
#c ----------------------+---------------+----+-----------+------+-+--+-----------+--------------
#2 VEFPIKPGTTLTKEK--IEYYVEWALDATAYKEFRVVELDTSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVIEKK
ss2 EEETTCCTCCCHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHEEHHHHHHHHHHHHHHCHHHTTTEECHHHHHTTCCTTTCEEEHHHHHHH
pseudoscore: 8.51
1st sequence starts at 1
2nd sequence starts at 72
>>f_1; hypothetical protein L_2128 [Legionella] {gene:L_2128}_start=201;end=320;length=120;source_length=320
LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCTLINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKASKARIHVADIEKIVGDKISYKNKNVISG
&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 | -0.368E+01 >>_vfdb.0002001_ VFG001328(gi:21283614) (sak) Staphylokinase precursor [Staphylokinase (VF0021)] [Staphylococcus aureus subsp. aureus MW2] :_: Length: 163
ss1 HHHHHHHHHEEEETTCCCCCCCHHHEEEHTTTTTTTH-HHHHHHEEEEHHHHHHTHEEEEEECTCCCCHHHEEECCCCCCTTEHHHHHHHHHHH
#1 LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCT-LINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKAS
#c ----------------------+---------------+----+-----------+------+-+--+-----------+--------------
#2 VEFPIKPGTTLTKEK--IEYYVEWALDATAYKEFRVVELDTSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVIEKK
ss2 EEETTCCTCCCHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHEEHHHHHHHHHHHHHHCHHHTTTEECHHHHHTTCCTTTCEEEHHHHHHH
pseudoscore: 8.51
1st sequence starts at 1
2nd sequence starts at 72
My code:
# xlsx files
setwd('D:/Dropbox/color_ffas_results')
library(xlsx)
wb <- loadWorkbook("sample.xlsx")
sheet1 <- getSheets(wb)[[1]]
# get all rows
rows <- getRows(sheet1)
cells <- getCells(rows)
# look at the values
sapply(cells, getCellValue)
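# placeholder: this is where the color should be set (the part I don't know how to do)
cellColor <- function(style) {
  # SET COLOR HERE
}
# regex for the sequence lines ("#1 ...", "#2 ..."); the backslash must be doubled in an R string
#sequence_pattern <- "^#\\d .*"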
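One piece I can sketch without knowing the right Excel library: grouping consecutive characters that share a color into runs, on the assumption that in-cell formatting is applied to character ranges rather than to single characters. The function name `color_runs` is hypothetical, and the input vector is made-up example data:

```r
# Collapse a per-character color vector into runs of (start, end, color).
# Positions are 1-based and inclusive.
color_runs <- function(colors) {
  r <- rle(colors)                 # run-length encoding of identical neighbours
  end <- cumsum(r$lengths)         # last position of each run
  start <- end - r$lengths + 1     # first position of each run
  data.frame(start = start, end = end, color = r$values,
             stringsAsFactors = FALSE)
}

# Example: colors for the 8 characters of "LAKELTYT" under a made-up rule
runs <- color_runs(c("red", "red", "black", "black",
                     "red", "black", "black", "black"))
```

Each row of `runs` would then correspond to one formatting call for that cell, whatever the library turns out to be.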