Automated Slogan Production Using a Genetic Algorithm


Polona Tomašič* (XLAB d.o.o., Pot za Brdom 100, 1000 Ljubljana, Slovenia; polona.tomasic@gmail.com)
Gregor Papa* (Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia; gregor.papa@ijs.si)
Martin Žnidaršič* (Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia; martin.znidarsic@ijs.si)

*Also affiliated with the Jožef Stefan International Postgraduate School, Jamova cesta 39, 1000 Ljubljana, Slovenia

The Student Workshop on Bioinspired Optimization Methods and their Applications (BIOMA 2014), 13 September 2014, Ljubljana, Slovenia

Introduction

Slogan generation is a field of Computational Creativity.

The state of the art: the BRAINSUP framework for creative sentence generation (Özbal, 2013):
- The user provides keywords, a domain, and emotions.
- Beam search through the search space of possible slogans.

Our method aims at a completely autonomous approach:
- The user provides only a short textual description of the target entity.
- Based on a genetic algorithm.
- Follows the BRAINSUP framework in the initial population generation phase.
- Uses a collection of heuristic slogan evaluation functions in the evaluation phase.

Resources

A database of existing slogans.

Resources

A database of frequent grammatical relations between words in sentences, along with their part-of-speech tags. Example: the Stanford Dependencies Parser's output for the sentence "Jane is walking her new dog in the park".
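As an illustration, here is a minimal sketch of extracting such relations, using spaCy as a stand-in for the Stanford Dependencies Parser that the authors used (it assumes the en_core_web_sm model is installed):

```python
import spacy

# spaCy is an assumption here; the authors used the Stanford
# Dependencies Parser, which produces comparable relation triples.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Jane is walking her new dog in the park.")

# Print (relation, head/POS, dependent/POS) triples, the kind of
# information such a relation database would be built from.
for token in doc:
    print(f"{token.dep_}({token.head.text}/{token.head.pos_}, "
          f"{token.text}/{token.pos_})")
```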

Resources

A database of slogan skeletons: existing slogans with the content words removed. (An example skeleton is shown on the slide.)
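A minimal sketch of deriving such a skeleton, under the assumption that content words (nouns, verbs, adjectives, adverbs) are replaced by their part-of-speech tags; the paper's exact skeleton format may differ:

```python
import nltk  # requires nltk.download("punkt") and nltk.download("averaged_perceptron_tagger")

# Penn Treebank tag prefixes for content words (an assumption).
CONTENT_PREFIXES = ("NN", "VB", "JJ", "RB")

def skeleton(slogan: str) -> str:
    """Replace content words with POS placeholders, keep function words."""
    tagged = nltk.pos_tag(nltk.word_tokenize(slogan))
    return " ".join(f"[{tag}]" if tag.startswith(CONTENT_PREFIXES) else word
                    for word, tag in tagged)

print(skeleton("Just do it"))  # something like: [RB] [VB] it
```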

Slogan Generation

INPUT: a textual description of a company or a product, and the algorithm parameters.
OUTPUT: a set of generated slogans.
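Expressed as a hypothetical top-level interface (the function and parameter names are illustrative, with defaults borrowed from the experiments later in the talk):

```python
def generate_slogans(description: str,
                     population_size: int = 25,
                     iterations: int = 150,
                     weights: dict | None = None) -> list:
    """Run the genetic algorithm on the textual description and
    return the final set of generated slogans.
    NOTE: illustrative signature only; not from the paper."""
    ...
```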

Keywords and Entity Extraction

Keywords: the most frequent non-negative words in the input text.
Entity: the most frequent entity in the input text.
Example: the keywords and entity extracted from the Coca-Cola Wikipedia page (shown on the slide).
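A minimal sketch of the keyword step, assuming "most frequent" is counted over lower-cased words with stop words removed (the paper's exact filtering, e.g. of sentiment-negative words, may differ):

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "it", "to", "was"}

def extract_keywords(text: str, k: int = 10) -> list:
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]

# Usage (hypothetical input variable): extract_keywords(coca_cola_wiki_text)
```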

Initial Population

Based on the BRAINSUP framework, with some modifications and additions.

Evaluation

An aggregate evaluation function, composed of 9 sub-functions:
- Bigram function
- Length function
- Diversity function
- Entity function
- Keywords function
- Word frequency function
- Polarity function
- Subjectivity function
- Semantic relatedness function

The weights of the sub-functions can be changed to tune the outputs.
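A minimal sketch of the weighted aggregation; the two sub-functions shown are illustrative dummies, not the paper's definitions:

```python
def evaluate(slogan: str, sub_functions: dict, weights: dict) -> float:
    """Aggregate score: a weighted sum of sub-function scores,
    each assumed to lie in [0, 1]."""
    return sum(weights[name] * fn(slogan)
               for name, fn in sub_functions.items())

# Two dummy sub-functions, for illustration only:
subs = {
    "length":    lambda s: min(len(s.split()) / 8.0, 1.0),
    "diversity": lambda s: len(set(s.lower().split())) / max(len(s.split()), 1),
}
score = evaluate("You can love the simple Coke", subs,
                 {"length": 0.03, "diversity": 0.15})
```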

Production of a New Generation

- 10% elitism.
- 90% roulette wheel.
- Crossover with probability p_crossover.
- Mutation with probability p_mutation.
- Deletion of similar slogans.
- Random slogans are seeded into the population if necessary.
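A minimal sketch of the selection step under these settings (crossover and mutation, described on the next slide, are applied to the selected parents):

```python
import random

def select_parents(population: list, scores: list,
                   elite_frac: float = 0.1) -> list:
    """Keep the top 10% unchanged (elitism) and fill the remaining
    90% by roulette-wheel (fitness-proportionate) sampling."""
    ranked = sorted(zip(scores, population), reverse=True)
    n_elite = max(1, int(elite_frac * len(population)))
    elite = [slogan for _, slogan in ranked[:n_elite]]
    # random.choices with weights implements the roulette wheel.
    rest = random.choices(population, weights=scores,
                          k=len(population) - n_elite)
    return elite + rest
```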

Production of a New Generation: Crossover and Mutation

Small crossover and big crossover (both illustrated on the slide).

Small mutations: replacement of a word with its synonym, antonym, meronym, hyponym, hypernym, or holonym.
Big mutations: deletion of a word; addition of an adjective or an adverb; replacement of a word with another random word with the same part-of-speech tag.
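A minimal sketch of one small mutation, synonym replacement via WordNet; the other relations (antonyms, hypernyms, etc.) would be handled analogously, and the implementation details are assumptions rather than the paper's code:

```python
import random
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def small_mutation(slogan: str) -> str:
    """Replace one randomly chosen word with a WordNet synonym,
    if any exists; otherwise return the slogan unchanged."""
    words = slogan.split()
    idx = random.randrange(len(words))
    candidates = {lemma.name().replace("_", " ")
                  for synset in wn.synsets(words[idx])
                  for lemma in synset.lemmas()}
    candidates.discard(words[idx])
    if candidates:
        words[idx] = random.choice(sorted(candidates))
    return " ".join(words)
```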

Deletion of Similar Slogans

- Duplicate slogans are removed.
- Of two similar slogans, the one with the lower evaluation score is removed.
- This prevents premature convergence of the population.
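A minimal sketch of this step; the similarity measure (difflib's sequence ratio) and the threshold are assumptions, as the slide does not specify them:

```python
from difflib import SequenceMatcher

def delete_similar(slogans_scores: list, threshold: float = 0.8) -> list:
    """Walk the population best-first and drop any slogan that is
    too similar to one already kept (duplicates have ratio 1.0)."""
    kept = []
    for slogan, score in sorted(slogans_scores, key=lambda p: -p[1]):
        if all(SequenceMatcher(None, slogan, other).ratio() < threshold
               for other, _ in kept):
            kept.append((slogan, score))
    return kept
```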

Experiments

Input: a textual description of Coca-Cola, obtained from Wikipedia.

Algorithm parameters:
- Weights: bigram 0.22, length 0.03, diversity 0.15, entity 0.08, keywords 0.12, frequent words 0.10, polarity 0.15, subjectivity 0.05, semantic relatedness 0.10
- p_crossover = 0.8 (p_crossover_big = 0.4, p_crossover_small = 0.2, p_crossover_both = 0.4)
- p_mutation = 0.7 (p_mutation_small = 0.8, p_mutation_big = 0.2)
- Number of iterations of the genetic algorithm: 150
- Number of runs of the algorithm for the same input parameters: 20
- Population sizes: 25, 50, and 75
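The same setup expressed as one configuration object (the dictionary structure is an illustration; the values are those reported on the slide):

```python
CONFIG = {
    "weights": {"bigram": 0.22, "length": 0.03, "diversity": 0.15,
                "entity": 0.08, "keywords": 0.12, "frequent_words": 0.10,
                "polarity": 0.15, "subjectivity": 0.05,
                "semantic_relatedness": 0.10},
    "p_crossover": 0.8,   # big: 0.4, small: 0.2, both: 0.4
    "p_mutation": 0.7,    # small: 0.8, big: 0.2
    "iterations": 150,
    "runs_per_setting": 20,
    "population_sizes": [25, 50, 75],
}
```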

Results

(The result figures are not included in the transcript.)

Example of Final Slogans (population size 25)

1. Love to take the Coke size (0.906)
2. Rampage what we can take more (0.876)
3. Love the man binds the planetary Coke (0.870)
4. Devour what we will take later (0.859)
5. You can put the original Coke (0.850)
6. Lease to take some original nose candy (0.848)
7. Contract to feast one's eyes the na keep (0.843)
8. It ca taste some Coke in August (0.841)
9. Hoy despite every available larger be farther (0.834)
10. You can love the simple Coke (0.828)

(Slogans are reproduced verbatim, including their grammatical errors; the number in parentheses is the aggregate evaluation score.)

Conclusion and Further Work

- The genetic algorithm ensures that slogan scores increase with every new generation.
- The current method can be useful for brainstorming.
- There is plenty of room for improvement.

Further work:
- refinement of the evaluation functions,
- correction of grammatical errors,
- machine learning for computing the weights of the evaluation functions,
- adaptive calculation of the control parameters of the genetic algorithm, ...

Thank you