Stanford NLP KBP corpus training
05 March 2019

I am trying to learn how to build a corpus for relation extraction and train a model on it. I have learned that the corpus needs to be in CoNLL format. However, I don't know how I am supposed to train with that corpus.

Here is the code I have so far, which prints a sample text in CoNLL format. I am not sure how I would then edit the resulting file with the appropriate changes and train with it.

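    import java.io.BufferedWriter;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.util.Properties;

    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    // The statements below belong inside a method body.
    // Configure the pipeline with the annotators to run, including kbp.
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,depparse,coref,natlog,sentiment,kbp,quote");
    props.setProperty("coref.algorithm", "neural");

    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);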

    String text = "The modern definition of artificial intelligence (or AI) is \"the study and design of intelligent agents\" where an intelligent agent is a system that perceives its environment and takes actions which maximizes its chances of success. " +
            "John McCarthy, who coined the term in 1956, defines it as \"the science and engineering of making intelligent machines. " +
            "Other names for the field have been proposed, such as computational intelligence, synthetic intelligence or computational rationality. " +
            "The term artificial intelligence is also used to describe a property of machines or programs: the intelligence that the system demonstrates. " +
            "AI research uses tools and insights from many fields, including computer science, psychology, philosophy, neuroscience, cognitive science, linguistics, operations research, economics, control theory, probability, optimization and logic. " +
            "AI research also overlaps with tasks such as robotics, control systems, scheduling, data mining, logistics, speech recognition, facial recognition and many others. " +
            "Computational intelligence Computational intelligence involves iterative development or learning (e.g., parameter tuning in connectionist systems). " +
            "Learning is based on empirical data and is associated with non-symbolic AI, scruffy AI and soft computing. " +
            "Subjects in computational intelligence as defined by IEEE Computational Intelligence Society mainly include: Neural networks: trainable systems with very strong pattern recognition capabilities. " +
            "Fuzzy systems: techniques for reasoning under uncertainty, have been widely used in modern industrial and consumer product control systems; capable of working with concepts such as 'hot', 'cold', 'warm' and 'boiling'. " +
            "Evolutionary computation: applies biologically inspired concepts such as populations, mutation and survival of the fittest to generate increasingly better solutions to the problem. " +
            "These methods most notably divide into evolutionary algorithms (e.g., genetic algorithms) and swarm intelligence (e.g., ant algorithms). " +
            "With hybrid intelligent systems, attempts are made to combine these two groups. " +
            "Expert inference rules can be generated through neural network or production rules from statistical learning such as in ACT-R or CLARION. " +
            "It is thought that the human brain uses multiple techniques to both formulate and cross-check results. " +
            "Thus, systems integration is seen as promising and perhaps necessary for true AI, especially the integration of symbolic and connectionist models. ";


    // Annotate an example document.
    //CoreDocument doc = new CoreDocument(text);
    //pipeline.annotate(doc);

    // Write the annotated text in CoNLL format. try-with-resources closes the
    // writer, so buffered output is flushed to the file even if printing fails.
    // The encoding is set to UTF-8 explicitly instead of the platform default.
    String outputFile = "ConnllTest1.txt";
    try (OutputStream stream = new FileOutputStream(outputFile);
         Writer w = new BufferedWriter(new OutputStreamWriter(stream, "UTF-8"))) {
        pipeline.conllPrint(pipeline.process(text), w);
    } catch (IOException e) {
        e.printStackTrace();
    }
...
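
For the "modify this file" step, here is a minimal sketch of reading a CoNLL-style file back in, changing a label, and writing it out again. It only uses the Java standard library. It assumes the layout that I believe conllPrint produces by default: one token per line, tab-separated columns in the order idx, word, lemma, pos, ner, headidx, deprel, and a blank line after each sentence. The class name ConllSketch, its methods, and the sample rows are hypothetical and not part of CoreNLP. It does not cover the training step itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for round-tripping CoNLL-style text; not part of CoreNLP. */
class ConllSketch {

    // Assumed column positions (idx, word, lemma, pos, ner, headidx, deprel).
    static final int WORD = 1;
    static final int NER = 4;

    /** Splits CoNLL-style text into sentences; each token row is an array of column values. */
    static List<List<String[]>> parse(String conll) {
        List<List<String[]>> sentences = new ArrayList<>();
        List<String[]> current = new ArrayList<>();
        for (String line : conll.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                // A blank line closes the current sentence.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                // Limit -1 keeps trailing empty columns.
                current.add(line.split("\t", -1));
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    /** Writes sentences back out: tab-separated columns, blank line after each sentence. */
    static String format(List<List<String[]>> sentences) {
        StringBuilder out = new StringBuilder();
        for (List<String[]> sentence : sentences) {
            for (String[] row : sentence) {
                out.append(String.join("\t", row)).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written illustrative rows, not actual pipeline output.
        String sample =
            "1\tJohn\tJohn\tNNP\tO\t2\tcompound\n" +
            "2\tMcCarthy\tMcCarthy\tNNP\tO\t3\tnsubj\n" +
            "3\tcoined\tcoin\tVBD\tO\t0\troot\n" +
            "4\tit\tit\tPRP\tO\t3\tobj\n\n";

        List<List<String[]>> sentences = parse(sample);
        // Example manual correction: tag both name tokens as PERSON.
        sentences.get(0).get(0)[NER] = "PERSON";
        sentences.get(0).get(1)[NER] = "PERSON";
        System.out.print(format(sentences));
    }
}
```

In practice the input string would come from the file written above (for example via Files.readAllBytes), and the edited text would be written to a new file. If the real file uses a different column order, the WORD and NER constants need to change accordingly.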